Abstract

The process of evolutionary diversification unfolds in a vast genotypic space of potential outcomes. During the past century there have been remarkable advances in the development of theory for this diversification (Fisher, 1930; Wright, 1984; Hofbauer and Sigmund, 1988; Lynch and Walsh, 1998; Bürger, 2000; Ewens, 2004; Barton et al., 2007), and the theory's success rests, in part, on the scope of its applicability. A great deal of this theory focuses on a relatively small subset of the space of potential genotypes, chosen largely based on historical or contemporary patterns, and then predicts the evolutionary dynamics within this pre-defined set. To what extent can such an approach be pushed to a broader perspective that accounts for the potential open-endedness of evolutionary diversification ... far such theory can be pushed has not been addressed. Here a theorem is proven demonstrating that, because of the digital nature of inheritance, there are inherent limits on the kinds of questions that can be answered using such an approach. In particular, even in extremely simple evolutionary systems a complete theory accounting for the potential open-endedness of evolution is unattainable unless evolution is progressive. The theorem is closely related to Gödel's Incompleteness Theorem (Gödel, 1931; Nagel and Newman, 1958; Davis, 1965; van Heijenoort ed., 1967) and to the Halting Problem from computability theory (Turing, 1936; Cutland, 1980).



Introduction

Much of evolutionary theory is, in an important sense, fundamentally historical. The process of evolutionary diversification unfolds in a vast genotypic space of potential outcomes, and explores some parts of this space and not others. Nevertheless, a great deal of current theory restricts attention to a relatively small subset of this space, chosen largely based on historical or contemporary patterns, and then predicts evolutionary dynamics. Although this can work well for making short-term predictions, ultimately it must fail once evolution gives rise to genuinely novel genotypes lying outside this predefined set (Yedid and Bell, 2002).



This potential limitation on the predictive ability of many models of evolution has been noted on various occasions throughout the development of evolutionary theory (Levinton; Fontana and Buss; Wagner and Altenberg, 1996; Yedid and Bell, 2002), perhaps most famously by Dutch biologist Hugo DeVries when he remarked that "Natural selection may explain the survival of the fittest, but it cannot explain the arrival of the fittest" (DeVries, 1904). Such statements hint at the notion that many models of evolution are what we might call 'local', or 'closed', in the sense that they focus attention on a very small (local) region of the evolutionary tree and do not account for the possibility that evolution is an open-ended process.



The distinction between 'closed' and 'open-ended' models of evolution will be discussed in more detail below, but in recent years there have been several interesting studies published that are beginning to push the boundaries of analyses towards what we might naturally call open-ended models. These studies include models of abstract replicator populations ... Similarly, there have also been many in silico and artificial life ... generic, emergent, properties of evolution (Yedid and Bell; Chow et al.; ... 2009). In general these analyses have demonstrated that, once we allow for more open-ended evolution, a much richer suite of evolutionary possibilities arises.

The above studies collectively suggest that accounting for open-ended evolution in theory ... theoretical studies that allow for open-ended evolution, and so we might expect that much is yet to be learned by broadening evolutionary theory further in this way. My purpose with this article is therefore twofold. First, I simply wish to highlight the fact that there is an important distinction to be made between open-ended versus closed models of evolution (defined more precisely below), and to suggest that open-ended models might more faithfully represent the evolutionary process. Second, and more significantly, I wish to consider whether a push towards a predictive theory that embraces the potential open-endedness of evolution is likely to face additional obstacles, over and above those faced by closed models of evolution. Put another way, I ask the question: To what extent is the development of a predictive, open-ended evolutionary theory possible?

Although a complete answer to the above question is not possible, in what follows I will provide at least a partial answer. Furthermore, I demonstrate that this answer has interesting connections to the Halting Problem from computability theory and to Gödel's Incompleteness Theorem from mathematical logic. In particular, I will use results from these areas to prove a theorem that formally links the concept of progressive evolution to the possibility of developing such a predictive open-ended theory. There remains debate over if, and when, evolution might be progressive (Dawkins, 1997; Gould, 1997; Adami et al., 2000) and part of this debate stems from the lack of a precise yet general definition of progression. Thus, another way to view the results presented here is as providing such a definition. I will return to this point more fully in the discussion.



A Motivating Example

To sharpen the focus on these somewhat abstract ideas, it is worth beginning with a concrete motivating example involving evolutionary prediction. This section does so, focussing primarily on the broad conceptual issues involved. The section that follows then addresses these issues more precisely.

Consider trying to use evolutionary theory to predict the dynamics of human influenza. Specifically, consider trying to answer the following question: is it likely that a pandemic with the 1918 Spanish influenza strain will ever occur again? This is obviously a difficult, and still somewhat loosely defined, question so let's narrow things down further. One reason we might be skeptical about our ability to make such predictions is because of uncertainty in initial conditions and parameter values, as well as uncertainty about the evolutionary processes involved. In other words, perhaps we lack all of the information required to make such predictions. Furthermore, unexpected contingencies might thwart what would otherwise be accurate predictions. For example, an unanticipated volcanic eruption might temporarily alter commercial air travel patterns, and this might thereby alter the epidemiological and evolutionary dynamics of influenza.

These practical limitations are clearly important, but are they the only obstacle to making accurate evolutionary predictions or are there other, 'inherent', limitations as well? Does the difficulty of making evolutionary predictions stem simply from our lack of knowledge of the evolutionary processes involved or are there reasons why, even in principle, such evolutionary predictions are not possible?

It is this latter question that is the focus of this article, and therefore I will, at least temporarily, put the above practical concerns aside. Specifically, let's assume that we can build a model that adequately captures all of the relevant evolutionary processes, and that we can obtain all parameter estimates necessary to use such a model. Without getting too much into the specifics, one of the first things we would need to decide is the relevant strain space for the model. The simplest scenario would consider only two strains (e.g., the 1918 strain and the current, predominant, strain). More sophisticated scenarios might instead include several strains that are thought to be important in the dynamics. In either case, the resulting models would be 'closed' in the sense described in the introduction because they focus only on a finite (and relatively small) number of strains. Furthermore, given that there is a discrete and finite number of people who can be infected at any given time, there is then also a finite (and relatively small) number of possible evolutionary outcomes. As will be detailed more precisely later, this then implies that the process will either reach a steady state or it will display periodic behaviour (see Appendix [5]). Hence, if a closed model is an accurate description of the evolutionary process, then in principle we can answer the above question by simply running the model until one of these two outcomes occurs. At that point we need only observe whether or not a 1918 Spanish flu pandemic ever occurred during the run of the model (or if it occurred with significant probability).
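
The "run the model until it settles or cycles" argument for closed models can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not a model of influenza: the function `visits_target`, the seven-state update rule `toy_step`, and the integer state labels are all hypothetical, and the dynamics are assumed to be deterministic.

```python
def visits_target(step, start, target):
    """Run a deterministic closed model until some state repeats.

    Because the state space is finite, a repeat must eventually occur
    (a steady state is just a cycle of length one), so the loop halts.
    Returns True if `target` was visited before the first repeat.
    """
    seen = set()
    state = start
    while state not in seen:
        if state == target:
            return True
        seen.add(state)
        state = step(state)
    return False


def toy_step(state):
    """Hypothetical update rule on the seven states 0..6."""
    return (state * 3 + 1) % 7


# From state 0 the trajectory is 0, 1, 4, 6, 5, 2 and then back to 0.
print(visits_target(toy_step, 0, 5))  # state 5 lies on the cycle
print(visits_target(toy_step, 0, 3))  # state 3 is never reached from 0
```

The point of the sketch is that the procedure always halts with a definite answer, which is exactly what fails to be guaranteed once the state space is unbounded.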

But what if the evolutionary process is, instead, open-ended? To explore this possibility we need to be more specific about what is meant by open-ended. Consider again the influenza example. Influenza A has a genome size of more than 12,000 nucleotides, and therefore the number of possible genotypes is enormous. To gain some perspective on just how many genotypes are possible, let's restrict attention to only the smallest of the eight genomic segments of influenza. In this case there are then only approximately 800 nucleotides and therefore approximately $4^{800}$ different possible genotypes. To put this number in perspective, it is approximately $10^{400}$ times larger than the estimated number of atoms in the universe. For a model to be open-ended it would have to allow for such a vast set of possible evolutionary outcomes so that, as in reality, evolutionary change could continue unabated, producing potentially novel outcomes essentially indefinitely. The simplest way we might try to capture this theoretically is to assume that the space of possible genotypes is infinite.
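
The size comparison in this paragraph is easy to check with exact integer arithmetic. The sketch below assumes the commonly quoted rough figure of $10^{80}$ atoms in the observable universe, which is not given in the text; the variable names are illustrative.

```python
import math

SEGMENT_LENGTH = 800        # approximate nucleotides in the smallest segment
ATOMS_LOG10 = 80            # assumed rough estimate: about 10**80 atoms

genotypes = 4 ** SEGMENT_LENGTH                  # exact Python integer
log10_genotypes = SEGMENT_LENGTH * math.log10(4)
ratio_log10 = log10_genotypes - ATOMS_LOG10      # log10 of genotypes / atoms

print(len(str(genotypes)))      # number of decimal digits in 4**800
print(round(log10_genotypes))   # order of magnitude of the genotype count
print(round(ratio_log10))       # order of magnitude of the ratio
```

Under that assumption the number of genotypes exceeds the number of atoms by a factor on the order of $10^{400}$.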

Given these considerations, if evolutionary theory is to capture an open-ended evolutionary process, then its state space must be effectively infinite. This is a necessary but not a sufficient condition for open-ended evolution. For example, many stochastic Markovian models in population genetics have an infinite state space (e.g., the infinite alleles model; Kimura and Crow, 1964) but nevertheless do not display open-ended evolution. Rather, further assumptions are often made, such as the assumption that the Markov chain is irreducible and positively recurrent. These assumptions are usually made primarily for mathematical convenience but they rule out the possibility of open-ended evolution since they then guarantee the existence of a single unique equilibrium or stationary distribution. As a result, such models cannot capture the possibility that evolutionary change might continue indefinitely.

What if we relax these assumptions and allow for truly open-ended evolution in the theory that we develop? Are there then even further problems associated with making evolutionary predictions? For example, does this make answering the question about influenza evolution laid out at the start of this section more difficult? You might suspect that the answer is 'yes'; at least, the approach suggested above for closed models will no longer suffice because the evolutionary process is no longer guaranteed to settle down to an equilibrium or stationary distribution. Thus, the best we can possibly hope for is that there is some way to prove, using the structure of the model, whether or not such an outcome will occur. All practical difficulties of predicting evolution aside, it is therefore not obvious whether we can answer the above sort of question about influenza evolution, even in principle.

These issues are now starting to tread heavily into the fields of computability and mathematical logic and, roughly speaking, a theory that can answer the above kind of question about influenza evolution is referred to as a negation-complete theory. This terminology reflects the idea that the theory is complete in the sense of one being able to determine whether a given statement is true, or whether its formal negation is true instead. For example, in the context of influenza, a negation-complete theory would be able to predict whether the statement 'the Spanish flu will happen again' is true or whether its formal negation 'it is not true that the Spanish flu will happen again' is true instead. More generally, a negation-complete evolutionary theory would be one from which we could determine which parts of genotypic space will be explored by evolution and which will not.

Is such a negation-complete theory possible once we allow for open-ended evolution? In the remainder of this article I show that the answer to this question is closely related to the idea of progressive evolution. In particular, even if the system of evolution were simple enough for us to understand everything about how its genetic composition changes from one generation to the next, the following theorem is proven:

Theorem: A negation-complete evolutionary theory is possible if, and only if, the evolutionary process is progressive.

The above theorem will be made more precise shortly, but as already alluded to above, it stems from the fact that DNA affords evolution a mechanism of digital inheritance. As Maynard Smith and Szathmáry have noted (Maynard Smith and Szathmáry, 1995) the combinatorial complexity that arises thereby allows evolution to be effectively open-ended. Indeed, as will be argued below, digital inheritance allows one to characterize evolution (i.e., the change in genetic composition of a population) as a dynamical system on the natural numbers, and therefore the theorem proved below holds for any such dynamical system, not just those meant to model evolution. As a result, the theorem is closely related to other results from mathematics and computer science; namely Gödel's Incompleteness Theorem (Gödel, 1931; Nagel and Newman, 1958; Davis, 1965; van Heijenoort ed., 1967) and to the Halting Problem from computability theory (Turing, 1936; Cutland, 1980).



Statement and Proof of Theorem

In order to give precision to the above theorem, we must specify what is meant by 'the evolutionary process', as well as what it means for evolutionary theory to be negation-complete. The goal is to determine if, even in extremely simple evolutionary processes, there is some inherent limitation on evolutionary theory.

To this end, consider a simplified evolutionary process in which there is a well-mixed population of replicators with some maximal population size, and in which each replicator contains a single piece of DNA. This genetic code can mutate in both composition and length, with no pre-imposed bounds. Suppose that each replicator survives and reproduces in a way that depends only on the current genetic composition of the population. For additional simplicity, suppose that generations are discrete. All conclusions hold if events occur in continuous time instead (Appendix [5]). Finally, for simplicity of exposition, I will usually assume that the evolutionary dynamics are deterministic in the main text. Again, all results generalize to the case of stochastic evolutionary dynamics, albeit with a few additional assumptions (Appendix [5]).

With the above evolutionary dynamic, the genetic composition of the system will evolve over time, and we can characterize the state of the system at any time by the number of each type of replicator (e.g., the number of infections with each possible genotype of influenza). The goal then is to determine if it is possible to construct an evolutionary theory that can predict which parts of the space of potential evolutionary outcomes will be explored during evolutionary diversification, and which will not. Formally, the results presented below are valid for any theory whose derived statements are recursively enumerable. Axiomatic theories are one such example but (roughly speaking) any theoretical approach that can, in principle, be implemented by a computer falls into this category (Appendix [1]). Indeed, the statement and proof of the theorem rely on several ideas from computability theory (Appendix [2]).

The digital nature of inheritance provided by DNA means that, in principle, the number of distinct kinds of replicators that are possible is discrete and unbounded, a property Maynard Smith and Szathmáry refer to as 'indefinite' heredity (Maynard Smith and Szathmáry, 1995). It is indefinite heredity that allows for open-ended evolution. As a result, in principle, the set of possible population states during evolution is isomorphic to the positive integers; i.e., there exists a one-to-one correspondence between the set of possible population states and the positive integers. Such sets are called denumerable, and in fact the set of population states is effectively denumerable in a computability sense (Appendix [3]). Thus we can effectively assign a unique integer-valued 'code' to every possible population state.
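
One way to see that such a coding can be made effective is to write one down for the simplest ingredient, a single DNA sequence of arbitrary length. The sketch below uses bijective base-4 numeration. The function names `encode` and `decode` and the ordering A, C, G, T are illustrative assumptions, and a full population state (the counts of every genotype present) would need an additional pairing step that is not shown here.

```python
ALPHABET = "ACGT"  # assumed ordering of the four bases


def encode(seq):
    """Map a DNA string of any length to a unique positive integer."""
    n = 0
    for base in seq:
        # Bijective base 4: digits run from 1 to 4, so leading 'A's count.
        n = n * 4 + ALPHABET.index(base) + 1
    return n + 1  # shift so that the empty sequence receives code 1


def decode(code):
    """Inverse of `encode`: recover the DNA string from its code."""
    n = code - 1
    bases = []
    while n > 0:
        n, r = divmod(n - 1, 4)
        bases.append(ALPHABET[r])
    return "".join(reversed(bases))


print(encode("A"), encode("T"), encode("AA"))
print(decode(encode("GATTACA")))
```

Every positive integer decodes to exactly one sequence and every sequence encodes to exactly one positive integer, so both directions can be computed by a terminating procedure.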

In practice, of course, there are limits on the number of kinds of replicators possible, if only because of a finite pool of the required chemical building blocks. Nevertheless, as mentioned earlier the combinatorial nature of indefinite heredity means that the actual number of possible population states is so large as to be effectively infinite. For simplicity of exposition, it is assumed in the main text that the set of possible population states is truly infinite; however, Appendix [6] makes the notion of 'effectively infinite' precise and provides the analogous results for this case.

With the above coding we can formalize evolution mathematically as a mapping of the positive integers to themselves. For example, in the deterministic case we might start with a model (e.g., a mapping $F$) that tells us the number of individuals of each genotype in the next time step, as a function of the current numbers. Then, under the above coding, if $E(n)$ denotes the population state (formally, its integer code number) at time $n$, the model can be recast as a single-variable, integer, mapping $E(n+1) = G(E(n))$ for some function $G$, along with some initial condition. Similarly, in the stochastic case, if we start with a probabilistic mapping $F$, then it can be recast as a mapping $E(n+1) = H(E(n))$ where $H$ gives the probability distribution over the set of code numbers in the next time step as a function of its current distribution (and $E$ is then a vector of probabilities over the integers). Therefore, in general, we can view the evolutionary trajectory as being simply an integer-valued function with an integer-valued argument. Of course, different ways of coding the population states will correspond to different maps, $G$ or $H$, and thus different functions $E(n)$. Also note that the domain of $G$ or $H$ need not be all of the positive integers, and in fact different initial conditions might give rise to different domains as well. This would correspond to there being different basins of attraction in the evolutionary process.
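
The recasting of a model $F$ on population states as an integer map $G$ amounts to composing $F$ with a coding and its inverse. The following sketch shows this for a deliberately tiny case and rests on several assumptions that are not in the text: a population state is taken to be a pair of counts for just two genotypes, the coding is the Cantor pairing function shifted to the positive integers, and the update rule `F` is an arbitrary toy.

```python
import math


def pair(a, b):
    """Cantor pairing: bijection from pairs of non-negative integers
    to the non-negative integers."""
    return (a + b) * (a + b + 1) // 2 + b


def unpair(z):
    """Inverse of the Cantor pairing function."""
    w = (math.isqrt(8 * z + 1) - 1) // 2
    b = z - w * (w + 1) // 2
    return w - b, b


def encode_state(state):
    """Code a (count_1, count_2) population state as a positive integer."""
    a, b = state
    return pair(a, b) + 1


def decode_state(code):
    """Recover the population state from its positive integer code."""
    return unpair(code - 1)


def F(state):
    """Toy model on population states (hypothetical update rule)."""
    a, b = state
    return (b, a + b)


def G(code):
    """The same model recast as a map from integer codes to integer codes."""
    return encode_state(F(decode_state(code)))


code = encode_state((1, 1))
trajectory = [code]
for _ in range(4):
    code = G(code)
    trajectory.append(code)
print(trajectory)  # E(0), E(1), ... as integer codes
```

Choosing a different coding in place of `encode_state` would give a different map `G` and a different integer trajectory for the same underlying dynamics.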

It is also worth noting that, although we have assumed the evolutionary mapping (i.e., $G$ or $H$) is a function of the current genetic composition of the population only, we can relax this assumption and allow evolutionary change to depend on other aspects of the environment as well. In particular, we might expand our definition of 'population state' to include both genetic state, and the state of other variables associated with the environment in which the genes exist. Again, as long as such generalized processes can be recast as dynamical systems on the natural numbers, all of the results presented here continue to hold.

The above arguments illustrate how we can view evolution as a dynamical system on the natural numbers, and they also now allow us to formalize the notion of open-ended evolution. In the deterministic setting evolution is open-ended if the mapping $G$ never revisits a previously visited state. Likewise, in the stochastic setting, evolution is open-ended if the mapping $H$ always admits at least one new state each generation with positive probability.

Because we can view evolution as a dynamical system on the natural numbers, evolutionary theory can be viewed as a set of specific rules for manipulating and deducing statements about such numbers. Computability theory deals with functions that map positive integers to themselves, and thus provides a natural set of tools to analyze the problem. A function is called 'computable' if there exists some algorithmic procedure that can be followed to evaluate the function in a finite number of steps (Appendix [2]).

Again, focusing on the deterministic case, given the assumption that we are able to predict the state of the population from one time step to the next, the function $E(n)$ is computable (see Appendix [2]). Furthermore, the set of all computable functions is denumerable (Cutland, 1980). Therefore, denoting the $k$th such function by $\phi_k(n)$, it is clear the evolutionary process, $E(n)$, must correspond to a member of this set. Denote this specific member by $\phi_E(n)$, and again note that, if we change the integer-coding used to identify specific population states, we will obtain a different function $E(n)$, and thus a different member of the set, $\phi_{\tilde{E}}(n)$ (Fig. 1).

During evolution, a set of population states will be visited over time (in the stochastic case we consider a state as being visited if the probability of it occurring at some point is larger than a threshold value; Appendix [5]). These will be referred to as 'evolutionarily attainable' states. In terms of our formalism, this corresponds to the function $\phi_E(n)$ taking on various values of its range, $R_E$, as $n$ increases (Fig. 1). A negation-complete evolutionary theory would be one that can determine whether a code, $x$, satisfies $x \in R_E$ or whether it satisfies $x \notin R_E$ instead. In the language of computability theory, this corresponds to asking whether the predicate '$x \in R_E$' is decidable (Appendix [2]; Cutland, 1980). In terms of the influenza example presented earlier, if $x$ is the population state corresponding to a pandemic with the 1918 strain, then the statement 'the Spanish flu will happen again' corresponds to the number-theoretic statement $x \in R_E$. Likewise, the statement 'it is not true that the Spanish flu will happen again' corresponds to the number-theoretic statement $x \notin R_E$.

Lastly, we can give a precise definition of progressive evolution. Intuitively, evolution is progressive if there is some quantifiable characteristic of the population that increases through evolutionary time. In terms of the above formalization, this means there is a way to recode the population states such that the code number increases during evolution. Formally, evolution is progressive if there exists a computable, one-to-one, coding of the population states by positive integers, $C$, such that the corresponding description of the evolutionary process, $\phi_{\tilde{E}}(n)$, satisfies $\phi_{\tilde{E}}(n+1) > \phi_{\tilde{E}}(n)$ for all $n$. Again, in terms of the influenza example presented earlier, if evolution were progressive, then there would be some way to a priori code the population states such that, as influenza evolution occurs, the code number of the population increases (I will return to this definition of progression in more detail in the discussion).

We can now rephrase the theorem in terms of precise, technical, language:

Theorem: '$x \in R_E$' is decidable if, and only if, there exists a computable, one-to-one, coding of the population states by positive integers, $C$, such that the corresponding description of the evolutionary process, $\phi_{\tilde{E}}(n)$, satisfies $\phi_{\tilde{E}}(n+1) > \phi_{\tilde{E}}(n)$ for all $n$.

Proof (Figure 1; see Appendices [2] and [4] for additional details):

Part 1: If there exists a coding $C$ such that $\phi_{\tilde{E}}(n+1) > \phi_{\tilde{E}}(n)$ for all $n$ then the predicate '$x \in R_E$' is decidable.

By hypothesis there exists a computable bijection $C$ such that, for the corresponding description of the evolutionary process, $\phi_{\tilde{E}}(n+1) > \phi_{\tilde{E}}(n)$ for all $n$. For any population state, $x$, in the original coding, let $\tilde{x}$ be the corresponding code under the bijection $C$, and define $z(\tilde{x}) = \mu i\,(\phi_{\tilde{E}}(i) > \tilde{x})$, where $\mu i\,(H(i))$ denotes the minimum value of $i$ for which the argument $H(i)$ is true (Appendix [2]). Further, define $R_k(n) = \{x : \phi_k(i) = x,\ i \le n\}$ (i.e., the range of $\phi_k(n)$ visited by step $n$; Appendix [2]). Clearly '$\tilde{x} \in R_{\tilde{E}}(z(\tilde{x}))$' is decidable since $R_{\tilde{E}}(z(\tilde{x}))$ is finite and can be enumerated, and furthermore $\tilde{x} \in R_{\tilde{E}}(z(\tilde{x})) \iff \tilde{x} \in R_{\tilde{E}}$ owing to the progressive nature of evolution. Therefore, '$\tilde{x} \in R_{\tilde{E}}$' is decidable as well. Finally, using $S$ to denote the set of population states that are evolutionarily attainable, and writing $\mathcal{C}$ for the original coding, we have that $\tilde{x} \in R_{\tilde{E}} \iff C^{-1}\tilde{x} \in S \iff \mathcal{C}C^{-1}\tilde{x} \in R_E$. Noting that, by definition, $x = \mathcal{C}C^{-1}\tilde{x}$, we obtain $\tilde{x} \in R_{\tilde{E}} \iff x \in R_E$. Thus, '$x \in R_E$' is decidable as well.
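
The decision procedure used in Part 1 can be written out directly. The sketch below is illustrative only: `is_attainable` is a hypothetical name, the trajectory `phi` is assumed to be already expressed in the progressive coding (strictly increasing), time is indexed from 0 for convenience, and the quadratic toy trajectory is made up for the example.

```python
def is_attainable(phi, x):
    """Decide whether code x is ever visited by the trajectory phi.

    Assumes phi is strictly increasing in n. The loop stops at the first
    step whose code equals or exceeds x; because codes only increase,
    x cannot appear later, so stopping there is safe and the search halts.
    """
    n = 0
    while True:
        value = phi(n)
        if value == x:
            return True
        if value > x:
            return False
        n += 1


def toy_phi(n):
    """Hypothetical progressive trajectory: 2, 3, 6, 11, 18, ..."""
    return n * n + 2


print(is_attainable(toy_phi, 11))  # visited at n = 3
print(is_attainable(toy_phi, 10))  # skipped, and can never be revisited
```

Without the monotonicity assumption the second branch is unavailable, and a failed search up to any finite step says nothing about later steps.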

Part 2: If the predicate '$x \in R_E$' is decidable then there exists a coding $C$ such that $\phi_{\tilde{E}}(n+1) > \phi_{\tilde{E}}(n)$ for all $n$.

We can construct the required computable bijection between population states and an appropriate coding as follows. First, take any effective coding of population states. By hypothesis '$x \in R_E$' is decidable and therefore we can proceed through the population states, $x$, in increasing order, applying the following algorithm:

(i) if $x \notin R_E$ and it is the $k$th such state up to that point, use the $k$th odd number as its new code.

(ii) if $x \in R_E$, calculate $i = \mu i\,(\phi_E(i) = x)$, and use the $i$th even number as its new code.

Thus, $R_{\tilde{E}}$ is the set of even numbers, and they are visited in increasing order as evolution proceeds. In particular, using $C\mathcal{C}^{-1}$ to denote the above mapping described in points (i) and (ii), where $\mathcal{C}^{-1}$ is the inverse mapping of the coding that generated $x$ (i.e., it takes code $x$ and returns the corresponding population state, $s$), we have $\phi_{\tilde{E}}(n+1) = C\mathcal{C}^{-1}\phi_E(n+1) = 2(n+1)$. The last equality follows from the fact that $C\mathcal{C}^{-1}\phi_E(n+1)$ determines the time at which state $\phi_E(n+1)$ occurs (which is $n+1$), and assigns it a new code equal to twice this value (point (ii) above). Therefore $\phi_{\tilde{E}}(n+1) > \phi_{\tilde{E}}(n)$ for all $n$.

Q.E.D.
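
The recoding in Part 2 can also be sketched as code. Everything named here is an assumption made for illustration: `recode` is a hypothetical function, `in_range` stands in for the decision procedure whose existence is the hypothesis of Part 2, time is indexed from 1, and the toy trajectory `phi_E` is chosen to be open-ended (it never repeats a code) but not increasing in the original coding.

```python
from itertools import count


def recode(x, phi_E, in_range):
    """Return the new code for original code x, following rules (i) and (ii)."""
    if in_range(x):
        # Rule (ii): find the time i at which x occurs. The search halts
        # because in_range has already confirmed that x is attained.
        i = next(i for i in count(1) if phi_E(i) == x)
        return 2 * i          # the i-th even number
    # Rule (i): x is the k-th unattainable code encountered so far.
    k = sum(1 for y in range(1, x + 1) if not in_range(y))
    return 2 * k - 1          # the k-th odd number


def phi_E(n):
    """Toy trajectory in the original coding: 6, 3, 12, 9, 18, 15, ..."""
    return 3 * n + 3 if n % 2 == 1 else 3 * n - 3


def in_range(x):
    """Toy decision procedure: the trajectory visits exactly the multiples of 3."""
    return x >= 3 and x % 3 == 0


new_trajectory = [recode(phi_E(n), phi_E, in_range) for n in range(1, 7)]
print(new_trajectory)  # codes of the visited states under the new coding
```

In the new coding the visited states receive the even numbers in order of visitation, so the recoded trajectory is strictly increasing even though the original one was not.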



Discussion

This article has two main goals. The first goal is to highlight the distinction between open-ended versus closed models of evolution, and to suggest that open-ended models might better capture real evolutionary processes. The second goal is to explore the extent to which the development of a predictive, open-ended theory of evolution is possible. The above theorem illustrates that there is an interesting connection between this question and analyses from computability theory and mathematical logic. It also draws a formal connection between the extent to which such a theory is possible and the notion of progressive evolution.

Because the theorem states an equivalence relationship between the possibility of developing a negation-complete theory and progressive evolution, it can be read in two distinct ways. First, it states that if evolution is progressive then a negation-complete theory is possible. This is, perhaps, not too surprising. If evolution is progressive then there would be a good deal of regularity to the process that one ought to be able to exploit in constructing theory. The second way to read the theorem is from the perspective of the reverse implication. This is somewhat more surprising; it states that if evolution is not progressive then a negation-complete theory will not be possible.



These results rest on the fact that digital inheritance allows evolution to be open-ended (Maynard Smith and Szathmáry, 1995). If, instead, the hereditary system allowed for only a finite number of discrete possible types, then evolution would either display periodic behaviour or would reach an equilibrium (possibly with stochastic fluctuations; Appendix [5]). A negation-complete theory of evolution would then be trivially possible in such cases because, in principle, we could simply develop a finite list of all evolutionary outcomes that can occur (as described in the influenza example earlier).

Of course, despite the existence of digital inheritance, there is nevertheless presumably a bound on the number of population states possible for a variety of reasons. Even so, the combinatorial nature of digital inheritance means that the number of possible population states might be considered effectively infinite. An analogous theorem can be proven in such cases by replacing the notion of infinite with a precise notion of effectively infinite instead (Appendix [6]). Likewise, although the main results of the text assume that evolution is deterministic, an analogous theorem holds that accounts for the inherently stochastic nature of the evolutionary process (Appendix [5]).

The notion of progressive evolution is somewhat slippery, and there does not exist a general yet precise definition of progression that is universally agreed upon. This has led to disagreement over the extent to which progressive evolution occurs (Dawkins, 1997; Gould, 1997). A complete discussion of the idea of progressive evolution is beyond the scope of this article but a few points are worth making here.

345 Most discussions of progressive evolution involve quantities like mean fitness, body size, 

346 complexity, or other relatively conspicuous biological measurements. Many such discussions 

347 also are retrospective in the sense that they look at historical patterns when attempting to 

348 find patterns of progression. But both of these aspects of discussions of progression are 

349 problematic. First, although it would be nice to readily identify some obvious, and 

350 biologically meaningful, characteristic of a population that changes in a directional way, 

351 there is no reason to expect that we have currently thought of all the possibilities. Thus, 

352 when defining progression, it would seem desirable to do so in a very general way, leaving 
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353 open the possibility that some biologically interesting, but as yet undiscovered quantity 

354 increases over time. Second, looking toward historical patterns for definitions of 

355 progression is essentially looking at data and then designing an hypothesis to fit. 

356 Progression ought to be defined prospectively rather than retrospectively, meaning that it 

ought to have predictive value; if evolution is progressive, then we ought to be able to
define, a priori, a quantity that will increase.

359 The definition of progression used here was purposefully chosen to deal with the 

360 above-mentioned difficulties. Thus, as it stands, it necessarily is not linked to any specific 

361 biological measurement. By the definition used here, the quantity that might increase over 

362 time need not have any obvious biological interpretation outside of the role that it plays in 

363 progressive evolution. This level of generality seems desirable if we are asking questions 

364 about the existence of such a quantity without necessarily knowing anything specific about 

365 what it might be. Such generality does mean, however, that if evolution is progressive in 

366 this sense, then the progressive trait might well be some highly complicated characteristic 

367 of the population that does not necessarily correspond to any biological attribute of an 

368 organism that is a priori natural. In this way, some readers might prefer to view the 

369 theorem presented here as a definition of progressive evolution rather than as a statement 

370 about the limitation of theory. In other words, we might define progressive evolution as an 

371 evolutionary process for which we could, in principle, construct a negation-complete 

372 evolutionary theory. The theorem then says that this definition is equivalent to there 

373 existing some quantity that increases over evolutionary time. 



Decidability results, such as those presented here, are often prone to misinterpretation
(Franzén, 2005). Therefore it is important to be clear about what the above theorem says

376 as well as what it does not say. First, the theorem does not imply that developing a 

377 predictive theory of evolution is impossible. A very large portion of current research in 

evolutionary biology is directed towards developing such predictive capacity and therefore
the theorem takes the existence of such a theory as a starting point. The rationale is to

380 determine whether there might still be other, inherent, limits on the kinds of questions that 

381 can be answered even if we are successful in pushing the development of current research in 

382 this direction. The theorem demonstrates that there are such inherent limits, and in 

383 essence the problem arises from a difficulty in predicting the places that evolution does not 

384 go. In other words, although a predictive theory can always be used to map out the course 

385 of evolution, interestingly, it cannot always be used to map out the courses that evolution 

386 does not take. The theorem presented here, in effect, demonstrates that doing the latter is 

387 not possible unless evolution is progressive. 
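
The role of progression can be made concrete with a small sketch. It assumes, hypothetically, that population states have been coded by positive integers in such a way that the code visited at each time step strictly increases; the map `phi` and the function name `decide_visited` are illustrative and not taken from the article. Under that assumption, asking whether a state is ever visited becomes a terminating search.

```python
def decide_visited(phi, code):
    """Decide whether `code` is ever output by a strictly increasing map.

    Assumes phi(n) < phi(n + 1) for all n >= 1. That assumption is what
    makes the loop terminate: once an output exceeds `code`, the value
    `code` can never appear later.
    """
    n = 1
    while True:
        value = phi(n)
        if value == code:
            return True
        if value > code:
            return False
        n += 1


# Hypothetical progressive trajectory: codes 2, 3, 4, ... at times 1, 2, 3, ...
def phi(n):
    return n + 1


state_one_visited = decide_visited(phi, 1)    # ruled out at the first step
state_four_visited = decide_visited(phi, 4)   # found at the third step
```

Without the monotonicity assumption the second exit from the loop is unavailable, and the same search can only confirm that a state is visited, never that it is not.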

388 How are these considerations to be interpreted in the context of examples like that of 

389 influenza evolution discussed earlier? First, as already mentioned in that example, the 

390 analysis would begin by taking what is essentially a best-case scenario, and supposing that 

391 we have enough knowledge of the system to develop an open-ended model that perfectly 

392 predicts (possibly in a probabilistic way) the genetic composition of the influenza 

393 population in the next time step, as a function of its current composition. Then we ask, is 

394 there a significant probability that another flu pandemic with the 1918 strain will ever 

395 occur? The above theorem states that, even if we had such a perfect model, this kind of 

396 question is unanswerable unless influenza evolution is progressive. In other words, unless 

397 some characteristic of the influenza population changes directionally during evolution (e.g., 

398 some aspect of the antigenicity profile changes directionally) such a prediction will not be 

399 possible. Moreover, this limitation arises because, even though we can use our perfect 

400 model to map out the course of influenza evolution over time, this need not be enough to 

401 map out the parts of genotype space that influenza will not explore. 

402 The above limitations apply to predictions about the genetic evolution of the population, 

403 but what if we are interested only in phenotypic predictions? For example, could we 

predict whether or not an influenza pandemic similar in severity to that of 1918 will ever
occur again, regardless of which strain(s) cause the pandemic? Likewise, could we predict

406 whether or not resistance to antiviral medication will ever evolve, regardless of its genetic 

407 underpinnings? If the genotype-phenotype map is one-to-one, then predicting phenotypic 

408 evolution will be no different than predicting genotypic evolution. Even if many different 

409 genotypes can produce the same phenotype, however, predicting phenotypic evolution still 

410 involves predicting whether or not certain subsets of genotype space are visited during 

411 evolution. As a result, all of the aforementioned limitations should still apply to such cases. 

The only exception is if the genotype-phenotype map resulted in the dimension of

413 phenotype space being finite even though the dimension of the genotype space was 

414 effectively infinite. Even in this case, however, the above limitations to prediction would 

415 still apply unless phenotypic knowledge alone was sufficient to predict the state of the 

416 population from one time step to the next (i.e., if we didn't need to consider genetic state 

417 to understand evolution). While this might be possible for some phenotypes of interest, it 

418 seems unlikely that it would be possible for all possible phenotypes. 

One might argue, however, that some patterns of phenotypic evolution are very predictable.

420 For example, the application of drug pressure to populations seems inevitably to lead to 

421 the evolution of resistance to the drug. How are these sorts of findings reconciled with the 

422 results presented here? First, although the evolution of resistance does appear to be 

423 somewhat predictable, we must distinguish between inductive versus deductive predictions. 

424 One reason we feel confident about predicting the evolution of drug resistance is that we 

425 have seen it occur repeatedly. Therefore, by an inductive argument we expect it to occur 

426 again. Such inductive predictions are conceptually similar to extrapolating predictions 

427 from a statistical model beyond the range of data available. On the other hand, deductive 

428 predictions are made by deducing a prediction from an underlying set of principles or 

429 mechanistic processes. In a sense, inductive predictions require no understanding of the 

430 phenomenon in question whereas deductive predictions are based on some underlying 

model of how things work. The results presented here apply solely to deductive predictions.

A second possibility with respect to the evolution of things like drug resistance, however, is

433 that evolution is progressive (at least at this 'local' scale). For example, it might well be 

434 that if we formulated an accurate underlying model for how influenza evolution proceeds in 

435 the presence of antiviral drug pressure, there would be some population-level quantity that 

436 changes in a directional way during evolution. Indeed it seems plausible that it is precisely 

437 this kind of directionality that makes us somewhat confident we can predict evolution in 

such cases. It should be noted, however, that even if evolution is not progressive the

439 theorem presented here does not rule out the possibility that some predictions can be 

440 made. For example, it is entirely possible that a theory could still be developed to make 

441 negation-complete predictions about the evolution of drug resistance. The theorem simply 

442 says that it will not be possible to make negation-complete predictions about any arbitrary 

443 aspect of evolution unless the evolutionary process is progressive. 

444 As already mentioned, all of the results presented here begin with the assumption that we 

445 can develop a theory to predict evolution from one time step to the next. Whether or not 

current theoretical approaches can be pushed to the point where this is true remains a
separate, and open, question. There are certainly considerable obstacles to doing so unless
the evolutionary system of interest is very simple (e.g., Ibarra et al. (2002)). In addition to
the problem that historical contingencies raise, the role of uncertainty in initial conditions,

450 much like those in weather forecasting, might preclude long-term predictions (although 

451 probabilistic statements might still be possible). This remains an important and active area 

452 of research on which the theorem presented here offers no perspective. Rather it simply 

453 reveals that, in the event that theory is eventually developed to do so, it will still face 

454 inherent limitations on the kinds of questions it can answer unless evolution is progressive. 

455 Although a negation-complete theory for the entire evolutionary process of interest is not 

456 possible unless evolution is progressive, this also does not preclude the possibility that a 

perfectly acceptable, negation-complete, theory might be developed for short-term and/or
local predictions. Indeed, just as similar inherent limitations in computability theory and
mathematical logic have not prevented people from making astonishing progress in these
areas of research, so too is the case for evolutionary biology. As mentioned in the

461 introduction, many theoretical advances have already been made by focusing on subsets of 

462 the space of potential evolutionary outcomes. Continuing to push theoretical development 

463 in this direction by broadening the space considered will be possible regardless of the 

464 nature of the evolutionary process. The theorem does imply, however, that unless evolution 

465 is progressive, it will not be possible to encompass all such developments within a single 

466 unified set of principles from which all negation-complete evolutionary predictions can be 

467 drawn. 



468 There are some previous theoretical results in the literature that consider the extent to 

469 which evolution exhibits a directional tendency and it is useful to consider how the present 

470 results relate to these previous works. For example, it has been shown previously with 

quite general stochastic models of evolution that a quantity termed 'free fitness' is always
non-decreasing during evolutionary change (Iwasa, 1988). The analysis, however, did not

473 allow for open-ended evolution because the state space was assumed to be finite, and the 

474 Markov model used was (implicitly) assumed to be positively recurrent. As a result, a 

475 unique stationary distribution existed and thus continual evolution was precluded. 
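
To illustrate why a finite, recurrent Markov model precludes continual evolution, the following sketch iterates a small transition matrix from two different starting distributions. The two-state matrix `P` is hypothetical and is not the model analysed by Iwasa (1988); it only shows the generic behaviour being described, namely convergence to a single stationary distribution regardless of the starting point.

```python
def stationary_distribution(P, start, iterations=500):
    """Approximate the stationary distribution by repeated multiplication.

    P[i][j] is the probability of moving from state i to state j, and
    `start` is an initial probability distribution over the states.
    """
    dist = list(start)
    n = len(P)
    for _ in range(iterations):
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return dist


# Hypothetical two-state chain.
P = [[0.9, 0.1],
     [0.5, 0.5]]

from_first = stationary_distribution(P, [1.0, 0.0])
from_second = stationary_distribution(P, [0.0, 1.0])
```

Both runs converge to the same distribution, so in the long run the model's behaviour is fixed and no indefinitely novel states can arise.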



It might be reasonably argued, however, that although analyses such as that of Iwasa (1988) do
not allow for truly open-ended evolution, if the state space is large enough, and if the
transient dynamics are long enough, then it is effectively an open-ended model. As such,
should not the results with respect to free fitness still apply? In other words, does this not
then suggest that there is some quantity (free fitness) that increases during evolution, and
thus that a negation-complete theory is possible? The answer is no, and the reason is
subtle but important. The definition of free fitness in Iwasa (1988), like other quantities
that have been suggested to change directionally during evolution (e.g., Adami et al.
(2000)), is based on measures closely related to entropy. Importantly, the mapping
between these measures of entropy and population states is not one-to-one because there
are many (indeed, potentially infinitely many) biologically distinct population states that
have the same value of entropy (or the same value of 'free fitness'). As a result, even
though measures like free fitness might not decrease during evolution, an indefinite amount
of biologically interesting and significant evolutionary change can still occur without any
change in free fitness. Roughly speaking, although measures related to things like entropy
provide an interesting physical quantity that might change directionally, the relationship
between entropy and quantities that are of biological interest need not be simple.
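
The many-to-one character of entropy-like measures is easy to see in a toy calculation. The sketch below uses the Shannon entropy of genotype frequencies as a stand-in; this is our simplification and not the definition of free fitness used by Iwasa (1988). The two populations are hypothetical.

```python
from collections import Counter
from math import log


def shannon_entropy(population):
    """Shannon entropy (in nats) of the genotype frequencies in a population."""
    counts = Counter(population)
    total = sum(counts.values())
    return -sum((c / total) * log(c / total) for c in counts.values())


# Two hypothetical populations that share no genotypes at all.
pop_a = ["AAT", "AAT", "AAT", "GCA"]
pop_b = ["TTG", "CCA", "CCA", "CCA"]

entropy_a = shannon_entropy(pop_a)
entropy_b = shannon_entropy(pop_b)
```

The two populations are genetically disjoint yet have the same entropy, because the measure depends only on the frequency profile and not on which genotypes carry those frequencies.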



493 In a similar vein one might argue that, because biological evolution takes place within a 

494 physical system that is subject to the Second Law of Thermodynamics, ultimately a 

general measure of entropy must provide a directionality to the system. Again, while this is
true in terms of the system as a whole, the mapping between entropy and the population

497 states of biological interest is not one-to-one. Thus, even though the total entropy of the 

498 entire physical system must always increase, the entropy of any component part (e.g., the 

499 biological part of interest) need not change in this way. 



500 What do all these considerations have to say about how the process of evolution is studied, 

501 or how current theoretical research is done? Should evolutionary biologists care about such 

502 results? For instance, do the results point to new ideas that might help us do theory 

503 better? Although there is no single answer to this question, there are two points worth 

making in this regard. First, the distinction between open and closed models seems like a

505 useful, and currently somewhat under-appreciated, way to categorize models of evolution. 

506 As such it does suggest some new directions in which evolutionary theory might be taken, 

particularly given that open-ended models are sometimes amenable to asking novel, and
potentially very important, evolutionary questions that cannot be addressed with closed
models (e.g., Fontana and Buss (1994); Fontana and Schuster (1998b); Lenski et al. (1999);
Stadler et al. (2001); Lenski et al. (2003); Yedid and Bell (2001); Wilke et al. (2001);
Chow et al. (2004); Ostrowski et al. (2007); Yedid and Bell (2002)). Second, to the extent



512 that one cares about developing theory for open-ended evolutionary processes, the theorem 

513 presented here then reveals that there is an inherent 'upper bound' on how far we can push 

514 the predictive capability of such theory. In particular, although such theory opens the door 

515 to asking new evolutionary questions, unless evolution is progressive, there will remain 

516 some such questions that are unanswerable. Furthermore, although it will likely be difficult 

to use the theorem as a means of proving that evolution is progressive (i.e., by developing a
negation-complete theory) or to use the theorem to prove that a complete evolutionary

519 theory is possible (i.e., by determining that evolution is progressive) the result does 

520 nevertheless reveal that these two important, and somewhat distinct, biological ideas are 

521 fundamentally one and the same thing. 

522 My intention was not to imply that the theorem could be used to determine decidability 

523 from knowledge of progression, or the reverse. Rather, it was to prove (within the set of 

assumptions used) that decidability and progression can be viewed as one and the same
thing.



The theorem presented here has close ties to Gödel's Incompleteness Theorem for
axiomatic theories of the natural numbers (Gödel, 1931; Nagel and Newman, 1958;
Davis, 1965; van Heijenoort ed., 1967; Smith, 2007). An axiomatic theory consists of a set of
symbols, a logical apparatus (e.g., the predicate calculus), a set of axioms involving the
symbols, and a set of rules of deduction through which new statements involving the
symbols can be derived (termed 'theorems'; Smith (2007)). Given such a system, theorems
can be derived through the repeated algorithmic application of the rules of deduction.



In the early 1900s there was a concerted attempt to produce such an axiomatic theory
that was meant to represent the natural numbers, with the proviso that it yield all true
statements about the natural numbers, and no false ones (Whitehead and Russell, 1910;
Davis, 1965; Smith, 2007). Gödel's Incompleteness Theorem (Gödel, 1931; Nagel and
Newman, 1958; van Heijenoort ed., 1967; Smith, 2007), however, revealed that this is
impossible for any axiomatic system sufficiently rich that it can make simple
number-theoretic statements. For example, it shows that if the axiomatic system is rich
enough that it can express the number-theoretic statement corresponding to the predicate
'x ∈ R_E', then it cannot produce all true number-theoretic statements and no false ones
(Smith, 2007). For if it could, then it could always produce the number-theoretic statement
corresponding to either 'x ∈ R_E' or 'x ∉ R_E' as a theorem, because one of the two must be
true. But if it can do this, then it provides an algorithmic procedure for deciding the
predicate 'x ∈ R_E', and we know that this is not always possible as the results presented
here illustrate.



The Halting Problem from computability theory (Turing, 1936; Cutland, 1980) is also
intimately related to the results presented here. As already detailed, the question of
whether a population state is evolutionarily attainable is equivalent to the question of
whether a given positive integer is in the range of a particular computable function.
Moreover, this latter question is directly connected to the analogous question of whether a
given integer is in the domain of a computable function (i.e., whether, given a particular
integer input, the function returns a value in finite time). The latter problem is precisely
the Halting Problem, and it is known that there is no general algorithmic procedure for
solving the Halting Problem for arbitrary computable functions (Turing, 1936; Cutland,
1980).



As mentioned earlier, in a very general sense, the results presented here are applicable to
any system that can be faithfully described by a Markov dynamical system over an infinite
set of discrete possibilities (i.e., an open-ended dynamical system). Therefore, one might
ask whether there is anything in the results presented that is particular to evolution per se.
In one sense the answer is 'no', but therein lies the power of such mathematical abstraction;
it reveals the underlying, key, structure of the process. Evolution will be an open-ended
dynamical system whenever heredity is indefinite, and it therefore shares a fundamental
similarity with all other processes that are also such open-ended dynamical systems.



565 At the same time, however, the results do have special significance for evolution. There are, 

566 perhaps, relatively few other kinds of processes of interest that share the property of being 

567 such an open-ended dynamical system in a meaningful way. For example, a great many 

568 processes of interest have a relatively small space of potential outcomes, and are thus 

569 clearly not open-ended. Furthermore, for those processes that are potentially open-ended, 

570 it is sometimes of little theoretical interest to distinguish among all possible outcomes, and 

571 therefore the space of relevant outcomes can still be relatively small. Moreover, even when 

572 the space of potential outcomes of interest truly is open-ended, some processes (e.g., some 

573 physical processes) obey simple enough dynamics that such negation-complete predictions 

can readily be made (i.e., the system is 'progressive' in the sense considered here). Thus,

575 the limitations detailed by the theorem are of interest, primarily for those processes that 

576 are both open-ended, and that are complex enough that the question of progression is 

unresolved (Appendix [4]). Evolution under indefinite heredity might be a somewhat unique

578 process in satisfying both of these criteria. 

579 There are, however, other processes of interest for which such decidability results might be 

580 of interest. After all, in an important sense, biological evolution is nothing more than the 

emergent properties of physics and chemistry. In fact such limitations on theory have been
discussed previously, particularly as they relate to the so-called theory of everything in
physics (Hawking, 2002). It is probably safe to say that no general consensus on this issue
has yet been reached (Franzén, 2005); however, the theorem presented here has

585 implications for any physical or chemical theory that aims to explain evolutionary 

586 phenomena. It demonstrates that a rational, deductive, approach to such theory will 

necessarily face some inherent limitations on the answers that it can provide.

Figure 1: A schematic representation of the coding of population states, and the theorem.
Middle irregular shape represents the space of population states, S, with four states
depicted (the ovals). Roman numerals indicate the time when each state is visited during
evolution (the silver-shaded state, s = {T, T, T}, is never visited). Vertical ovals on right and
left represent two different codings by the positive integers, along with their respective
evolutionary mappings over the first three time steps. If evolution is
progressive, then Coding 2 is possible, and the theorem then says we can 'decide' any
population state, s ∈ S. For example, we can decide state 'T,T,T' by finding its code (i.e.,
'1'), and then iterating the Coding 2 mapping until we obtain an output greater than '1' (this
occurs at time step 1 because the mapping returns 2 at n = 1). If '1' has not yet been visited
by this time, it never will be. Conversely, if all population states are decidable, then under
Coding 1 we can apply the algorithm provided in Part 2 of the theorem's proof to obtain
Coding 2, thereby demonstrating that evolution is progressive.

Appendices



1 Theory 



713 The term 'theory' is used in a technical sense. A theory consists of a set of symbols that 

constitute the language of the theory, a set of premises which are taken as given, and a set
of rules of inference (Smith, 2007). The symbols represent certain components of reality,

716 and the premises constitute statements about reality through the interpretation of the 

717 symbols. The rules of inference then constitute valid ways of deducing new statements 

718 about the symbols of the language, and thus through interpretation, new statements about 

719 reality. Thus, within such a theory, statements are derived by taking some premise(s), and 

720 applying the rules of inference. 

Statements derived through a series of deductive arguments using the rules of inference are
referred to as theorems of the theory. The result of the main text is valid for any
evolutionary theory whose theorems are recursively enumerable (Appendix [2]); i.e., any
theory whose theorems can be derived through the use of a finite (but possibly large)
number of mechanical, or algorithmic, steps (e.g., as laid out in the rules of inference;
Appendix [2]). This is clearly true for any such theory based on computation, since
computers do nothing more than mechanically follow rules (Cutland, 1980). It is also true
for any axiomatic theory, since the theorems of any such theory can be derived simply by
applying the mechanical rules of inference to the axioms (Smith, 2007).



730 A great deal of current quantitative theory in evolutionary biology fits the above template. 

731 For example, current theory often abstracts reality mathematically by assigning formal 

732 symbols to things like allele frequencies and population sizes. A set of premises is then 

733 taken, for example, by formalizing an hypothesis about how genotypic fitnesses are 

734 determined. Next, a finite number of applications of 'rules of inference' are used (e.g., the 
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735 application of certain mathematical operations) in order to derive statements about the 

736 formal symbols of this theory. Finally, these symbolic statements are then interpreted 

737 again in terms of their biological meaning, and hence predictions about evolution are made 
73B (Fig. SI). 

Figure S1: A schematic representation of the relationship between the biological process of
evolution and theory. The example given illustrates classical population-genetic theory. A
formal system is created to represent elements of evolution (e.g., p(t) represents the number
of the blue genotype at time t). A set of premises is specified (e.g., initial genotype
numbers, how genotypic fitnesses are determined, etc.; this is embodied by the mapping
F). Rules of deduction are then followed (e.g., repeated application of the mapping F) to
obtain new statements about elements of the formal theory (e.g., p(1), p(2), p(3), etc.).
These new elements are then interpreted in terms of evolution (e.g., as predictions about
genotype numbers at future times).
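
The template of premises, rules of deduction and interpretation can be sketched with a textbook two-type selection recursion. This is an assumed example and not the model behind Figure S1: here p is treated as the frequency (rather than the number) of the blue genotype, and the fitness values are hypothetical.

```python
from fractions import Fraction


def F(p, w_blue=Fraction(3, 2), w_other=Fraction(1)):
    """One generation of selection: maps the frequency p(t) to p(t + 1)."""
    mean_fitness = p * w_blue + (1 - p) * w_other
    return p * w_blue / mean_fitness


def trajectory(p0, steps):
    """Apply the 'rule of deduction' F repeatedly, giving p(0), ..., p(steps)."""
    p = [p0]
    for _ in range(steps):
        p.append(F(p[-1]))
    return p


# Premises: initial frequency 1/2 and the fitnesses above (both hypothetical).
predictions = trajectory(Fraction(1, 2), 2)
```

Each entry of `predictions` is a derived statement of the formal theory, which is then read back biologically as the predicted frequency of the blue genotype in that generation.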



2 Some results from computability theory 



A function is computable if it can be evaluated by an Unlimited Register Machine (URM)
in a finite number of steps (Cutland, 1980). The Church-Turing Thesis states that any
function we might view as being evaluated through a mechanical procedure can be
evaluated by a URM (Cutland, 1980). Thus, given the Church-Turing Thesis, the easiest
way to ascertain whether something is computable is to consider whether a computer could
be programmed to do it in such a way that an output is guaranteed, in a finite (but possibly
very large) number of steps.

Definition: A function is total if it is computable over all natural numbers.

Definition: A function is partial if it is computable only over some (nonempty) subset of
the natural numbers.

Definition: A set is denumerable if there exists a bijection between it and the natural
numbers.

Definition: A set is effectively denumerable if this bijection, and its inverse, are
computable.

Definition: The characteristic function of a set of natural numbers, A, is

\[
c_A(n) =
\begin{cases}
1 & \text{if } n \in A \\
0 & \text{if } n \notin A
\end{cases}
\tag{1}
\]

Definition: The predicate 'n ∈ A' is decidable if its characteristic function is computable.

Definition: The set A is recursive if the predicate 'n ∈ A' is decidable.

Definition: The partial characteristic function of a set of natural numbers, A, is

\[
c_A(n) =
\begin{cases}
1 & \text{if } n \in A \\
\text{undefined} & \text{if } n \notin A
\end{cases}
\tag{2}
\]

Definition: The predicate 'n ∈ A' is partially decidable if its partial characteristic
function is computable for n ∈ A.

Definition: The set A is recursively enumerable (denoted r.e.) if the predicate 'n ∈ A' is
partially decidable.

Note that every recursive set is r.e. but not vice versa. Furthermore, a set A is recursive if,
and only if, both A and its complement A^c are r.e. Finally, note that any finite set of
numbers is recursive (Cutland, 1980).
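
The statement that a set is recursive exactly when both it and its complement are r.e. has a simple constructive reading, sketched below. Two enumerators are run in alternation, and whichever one produces n first settles the question. The generator names and the choice of even and odd numbers are hypothetical and serve only as an illustration.

```python
def evens():
    """Enumerate the set A = {2, 4, 6, ...}."""
    n = 2
    while True:
        yield n
        n += 2


def odds():
    """Enumerate the complement of A within the positive integers."""
    n = 1
    while True:
        yield n
        n += 2


def decide_membership(enum_A, enum_complement, n):
    """Decide 'n in A' by alternating between the two enumerations.

    Terminates for every positive integer n, because n must eventually
    appear in exactly one of the two enumerations.
    """
    members, non_members = enum_A(), enum_complement()
    while True:
        if next(members) == n:
            return True
        if next(non_members) == n:
            return False


ten_is_member = decide_membership(evens, odds, 10)
seven_is_member = decide_membership(evens, odds, 7)
```

If only the enumeration of A were available, the same loop could confirm membership but would run forever on non-members, which is the difference between decidable and partially decidable predicates.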



774 The following concepts and notation will also prove useful: 

First, because any computable function can be evaluated through a series of steps, we can
define c_A^o(n) as the value of c_A(n) after the o-th step in its evaluation. In particular,
c_A^o(n) evaluates to 'null' if it has not returned a value by the o-th step.
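
A step-bounded evaluation of this kind might be sketched as follows. The sketch assumes that the set is presented by an enumeration producing one member per step, which is our modelling choice; the names `partial_characteristic` and `squares` are hypothetical, and Python's `None` plays the role of 'null'.

```python
from itertools import islice


def partial_characteristic(enumerate_set, n, max_steps):
    """Value of the partial characteristic function after `max_steps` steps.

    Returns 1 if n has been produced by the enumeration within the step
    budget, and None ('null') if no value has been returned by then.
    """
    for member in islice(enumerate_set(), max_steps):
        if member == n:
            return 1
    return None


def squares():
    """Hypothetical enumeration: 1, 4, 9, 16, ..."""
    k = 1
    while True:
        yield k * k
        k += 1


found = partial_characteristic(squares, 49, 10)
too_early = partial_characteristic(squares, 49, 5)
absent = partial_characteristic(squares, 50, 10)
```

A 'null' result is uninformative on its own: it does not distinguish a member that has not yet appeared from a non-member, however large the step budget.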

Second, a standard result from computability theory demonstrates that there exists a
computable bijection between N⁺ and N⁺ × N⁺ (Cutland, 1980). We will denote this
mapping by δ : n ↦ (T_1(n), T_2(n)).
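
One concrete bijection of this kind walks along the anti-diagonals of the grid of pairs. The article does not say which bijection it uses, so the construction below and the names `unpair` and `pair` are assumptions made for illustration.

```python
def unpair(n):
    """Map a positive integer n to a pair (i, j) of positive integers.

    Pairs are listed anti-diagonal by anti-diagonal:
    (1,1), (1,2), (2,1), (1,3), (2,2), (3,1), ...
    """
    d = 1                      # length of the current anti-diagonal
    while n > d:
        n -= d
        d += 1
    return n, d - n + 1


def pair(i, j):
    """Inverse of `unpair`: the position of (i, j) in the listing above."""
    d = i + j - 1              # anti-diagonal containing (i, j)
    return d * (d - 1) // 2 + i


first_pairs = [unpair(n) for n in range(1, 7)]
```

Both directions use only bounded arithmetic loops, so the bijection and its inverse are computable, which is all that the appendix requires of it.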

Third, the notion of an 'unbounded search' is central in computability theory. In
particular, it is standard to use the notation μy(f(y) = k) to denote 'the smallest value of
y such that f(y) = k'.
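
An unbounded search can be written directly as a loop. The sketch assumes the search runs over the positive integers starting at 1, in keeping with the appendix's use of positive integers; the function name `mu` is ours.

```python
def mu(f, k):
    """Return the smallest positive integer y with f(y) == k.

    This is an unbounded search: if no such y exists, the loop never
    terminates, which is exactly why the operator yields partial functions.
    """
    y = 1
    while f(y) != k:
        y += 1
    return y


smallest_root = mu(lambda y: y * y, 49)
```

Calling `mu(lambda y: y * y, 50)` would run forever, since no positive integer squares to 50.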

Fourth, a fundamental theorem of computability theory demonstrates that the set of all
computable functions is denumerable (Cutland, 1980). Thus, we can use φ_k(n) to denote
the k-th computable function, and R_k and D_k as its range and domain respectively. We will
also make use of the notation R_k(n) = {x : φ_k(i) = x, i ≤ n}. In other words, if φ_k(n) is
evaluated for increasing values of n, then R_k(n) is the subset of the range of φ_k(n) that has
been visited by step n. This is clearly computable for any n if φ_k(n) is total.
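
For a total function the visited part of the range is obtained by direct evaluation, as in the sketch below. The function `phi` is a hypothetical stand-in for one particular computable function, and inputs are taken to start at 1, which is our assumption.

```python
def visited_range(phi, n):
    """The set of outputs of phi on the inputs 1, 2, ..., n."""
    return {phi(i) for i in range(1, n + 1)}


# Hypothetical total function standing in for a particular computable function.
def phi(i):
    return (i * i) % 7


after_three_steps = visited_range(phi, 3)
after_ten_steps = visited_range(phi, 10)
```

The visited set can only grow with n, but knowing it for any finite n does not by itself reveal which values will never be added.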

Finally, notice that it was implicitly assumed that the mapping, G, corresponding to the
evolutionary process is computable, and thus E(n) is a computable function. Thus, the

792 evolutionary process is, in an important way, nothing other than computation. Although it 

793 is not practically feasible to verify or refute this assumption for most evolutionary systems, 

there are very good reasons to expect that this assumption is reasonable. First, if we are
willing to view the processes occurring in our biological system as being purely

796 'mechanical', then we can appeal to the Church- Turing Thesis to argue that G must 

797 thereby be computable. Second, the use of the term 'evolution', as a process, should not be 

798 restricted to a particular instantiation of this process, as for example occurs in 

799 carbon-based life. For example, there are very good reasons to think that the processes 

800 occurring in in silico evolution are fundamentally the same as those occurring in biological 

801 evolution. As such these would clearly be computable. Finally, even if biological evolution 

802 isn't formally computable (i.e., it is not mechanical) we nevertheless usually proceed by 

803 assuming that it can be modeled using computation. 



804 3 The set of population states is effectively 

805 denumerable 

806 Here we prove that the set of possible population states is effectively denumerable; i.e., 

807 that there exists a computable bijection between the population states and the positive 

808 integers with a computable inverse. Such sets are also called effectively denumerable. 

809 Proof: We simply need to demonstrate an effective procedure (i.e., a computable procedure) 

810 for both encoding and decoding the population states into positive integers. Let M be the 

811 maximum possible population size (a positive integer). Each of the M 'slots' is either 

812 vacant, or filled by an individual that is completely characterized by its DNA sequence. 

813 Furthermore, we can set A=0, C=1, G=2, T=3, and then read the DNA sequence from its

814 5' to 3' end, thereby establishing a unique characterization of each slot in the population.

815 (A) Encoding: For each of the M slots calculate a numeric code as follows: Reading the

816 DNA from its 5' to 3' end, for the $n$-th base, take the $n$-th prime number and raise it to the

817 power corresponding to this base as listed above. Multiply all these numbers together.

818 This gives a unique number for each distinct DNA sequence, and thus the mapping is

819 injective. Furthermore, since all positive integers greater than or equal to 2 have a unique 

820 prime factorization, all such integers correspond to a DNA sequence. Thus, if we code the 

821 state 'vacant' with the number 1, the mapping is surjective as well. Furthermore, this 

822 procedure is computable for any piece of DNA. This shows that there is a computable 

823 encoding for each slot, and since the population is simply the union of a finite number of 

824 such slots, the population state has a computable encoding as well. In particular, the 

825 coding of each slot locates a point in $\mathbb{N}^+ \times \cdots \times \mathbb{N}^+$ (where $\mathbb{N}^+$ appears M times) that can

826 be uniquely identified by its indices. One can then cycle through all possible indices as 

827 follows: start with all indices that sum to 1, then those that sum to 2 etc. This is 

828 computable, and for each instance we simply assign a code number in increasing order. 

829 (B) Decoding: For any given code number, cycle through the sets of indices as above, 

830 stopping once the code number is reached, and determine those indices. Once these indices 

831 have been obtained, one can determine their corresponding DNA through their prime 

832 factorization. 
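The encoding and decoding steps can be sketched as follows. Two assumptions are made here and are not taken from the text: the exponents are shifted by one (A=1, C=2, G=3, T=4) so that trailing A bases survive factorization, and index tuples are listed by increasing sum and then lexicographically. All function names are hypothetical, and the position search is only practical for very small codes.

```python
from itertools import count

BASES = "ACGT"

def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short sequences)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n

def encode_slot(dna: str) -> int:
    """Code 1 for a vacant slot (empty string), else a product of prime powers."""
    code = 1
    for p, base in zip(primes(), dna):
        code *= p ** (BASES.index(base) + 1)   # exponents 1..4 (assumption, see lead-in)
    return code

def decode_slot(code: int) -> str:
    """Recover the sequence by reading off the exponents of successive primes."""
    dna = []
    for p in primes():
        if code == 1:
            break
        e = 0
        while code % p == 0:
            code //= p
            e += 1
        if not 1 <= e <= 4:
            raise ValueError("not the code of a DNA sequence")
        dna.append(BASES[e - 1])
    return "".join(dna)

def tuples_by_sum(M: int):
    """Yield every M-tuple of positive integers, grouped by increasing index sum."""
    def compositions(total, parts):
        if parts == 1:
            yield (total,)
            return
        for first in range(1, total - parts + 2):
            for rest in compositions(total - first, parts - 1):
                yield (first,) + rest
    for total in count(M):
        yield from compositions(total, M)

def population_code(slot_codes) -> int:
    """Position of the tuple of slot codes in the enumeration above (1-based)."""
    target = tuple(slot_codes)
    for code, indices in enumerate(tuples_by_sum(len(target)), start=1):
        if indices == target:
            return code
```

Decoding a population code runs the same enumeration up to the given position and then applies `decode_slot` to each index.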



4 Some additional technical information about the theorem



835 The theorem of the text would be of little interest if it were never possible for '$x \in R_E$' to

836 be undecidable. It is well-known in computability theory that there exist computable

837 functions for which such predicates are undecidable (Cutland (1980); Appendix iD), but the

838 evolutionary process considered represents a special kind of computable function. In

839 particular, it must satisfy the mapping $\phi_k(n + 1) = G(\phi_k(n))$ for all n, where $G()$ is a

840 computable function with appropriate domain. The subset of computable functions

841 satisfying this relation will be referred to as Markov, total, computable functions.

842 This section presents a series of three lemmas that, together, demonstrate that there do in

843 fact exist Markov computable functions for which '$x \in R_E$' is undecidable (see also

844 Cutland (1980); Smith (2007)). In such cases, the set of evolutionarily attainable states,

845 $R_E$ will be called 'recursively enumerable' (r.e.; because '$x \in R_E$' is always at least

846 partially decidable for Markov computable functions). On the other hand, if '$x \in R_E$' is

847 decidable, then $R_E$ is said to be 'recursive' (Appendix [2] and Appendix Hj).

848 Lemma 1: A set of numbers is recursively enumerable if, and only if, it is the range of 

849 some total, computable, function. Note: we could relax the 'total' requirement without much 

850 change. 



Proof: (i) A r.e. $\Rightarrow$ 'A is the range of a total computable function'

852 Given A is r.e., the partial characteristic function of A is computable; i.e.

$$c_A(n) = \begin{cases} 1 & \text{if } n \in A \\ \text{undefined} & \text{if } n \notin A \end{cases} \tag{3}$$

853 is computable. Now first choose an $a \in A$. This is a computable operation since we can

854 simply use the bijection $B : n \mapsto (T_1(n), T_2(n))$ to evaluate $c_A^{T_2(n)}(T_1(n))$ for increasing n

855 until it returns a value of 1, and then identify the corresponding value $T_1(n)$. Next, we can

856 define the computable function

$$g(x, o) = \begin{cases} x & \text{if } c_A^{o}(x) = 1 \\ a & \text{otherwise} \end{cases} \tag{4}$$

857 Then, again we can use the computable bijection $B : n \mapsto (T_1(n), T_2(n))$ to define

858 $f(n) = g(T_1(n), T_2(n))$. This is a total computable function with range equal to A.

859 (ii) '$R_k$ is the range of a total computable function' $\Rightarrow$ $R_k$ r.e.

860 Consider the total function $\phi_k(n)$. We can then construct the computable partial

861 characteristic function for $R_k$ as follows: For any input value, x, output the value 1 after

862 evaluating $\mu i(\phi_k(i) = x)$.

863 Q.E.D. 
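Part (i) of the proof is a dovetailing construction, which the following sketch illustrates. It rests on assumptions that are not in the text: the partial characteristic function is modeled by a hypothetical predicate `accepts_within(x, o)` that is true when $c_A^{o}(x) = 1$, the bijection is a Cantor-style unpairing, and the toy set A is the perfect squares with an artificial cost of x steps to accept x.

```python
from math import isqrt

def unpair(n):
    """Cantor-style bijection from positive integers to pairs of positive integers."""
    m = n - 1
    w = (isqrt(8 * m + 1) - 1) // 2
    y = m - w * (w + 1) // 2
    return w - y + 1, y + 1

def find_member(accepts_within):
    """Choose an element a of A by trying (x, o) pairs in the order given by unpair."""
    n = 1
    while True:
        x, o = unpair(n)
        if accepts_within(x, o):
            return x
        n += 1

def make_total_enumerator(accepts_within):
    """Build f(n) = g(T1(n), T2(n)), with g(x, o) = x if x is accepted within o steps, else a."""
    a = find_member(accepts_within)
    def f(n):
        x, o = unpair(n)
        return x if accepts_within(x, o) else a
    return f

def accepts_within(x, o):
    """Toy semi-decider for the perfect squares that needs x steps to accept x."""
    return isqrt(x) ** 2 == x and o >= x
```

Every call `f(n)` halts because each check is step-bounded, and every member of A eventually appears as an output once n reaches a pair with a large enough step budget.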

864 Given Lemma 1, we can then prove the following, second, lemma:

865 Lemma 2: There exist total computable functions whose ranges are r.e. but not recursive.

866 Using Lemma 1, we can prove Lemma 2 by proving that there exist sets that are r.e. but 

867 whose complements are not r.e. 



868 Proof Sketch (by construction; see Smith (2007)):

869 We will demonstrate that $K = \{n : n \in R_n\}$ is one such set. It is clear, therefore, that

870 other such sets can be constructed as well.

871 First it can be proven that $K^c$ is not r.e. using Cantor's diagonal argument (e.g., see Smith

872 (2007)). In particular, since all r.e. sets are the range of some computable function, and

873 since the computable functions are denumerable, the set of all r.e. sets is denumerable. So

874 we simply need to construct a set that is not in this list. Choosing numbers n such that

875 $n \notin R_n$ satisfies this property, and this is exactly $K^c$.

876 All that remains then is to show that K is r.e. As with characteristic functions, all

877 computable functions are evaluated through a series of operations for each input, and

878 therefore we can consider the $o$-th operation of any computable function. Therefore, define

$$g(x, o, n) = \begin{cases} \phi_n(x) & \text{if } \phi_n(x) \text{ halted by operation } o \text{ in its evaluation} \\ n + 1 & \text{otherwise} \end{cases} \tag{5}$$

879 This is a computable function. Now we can use the bijection $B : n \mapsto (T_1(n), T_2(n))$ to

880 define $f(z, n) = g(T_1(z), T_2(z), n)$. This is also computable, and for any given n and z it

881 outputs either n + 1 or else an element of $R_n$. We can then construct the computable

882 partial characteristic function for K as follows: For any input value, n, output the value 1

883 after evaluating $\mu z(f(z, n) = n)$.

884 These results show that there exist computable functions whose ranges are r.e. but not 

885 recursive. Note that some such functions might have the same output values for more than 

886 one value in their domain, but these cannot be Markov computable functions. The reason 

887 is simply that the mapping G ensures that, if $R_E$ is infinite, then $\phi_E(n)$ can never repeat

888 itself as n increases (see Lemma 1, Appendix 5). Therefore, we still need to demonstrate

889 that, even if we restrict attention to Markov computable functions, some such functions 

890 have r.e. ranges that are not recursive. This is done in the third lemma: 

891 Lemma 3: For every total computable function having a range that is r.e. but not recursive,

892 there exists a total computable Markov function with the same range.

893 Proof: Suppose that $\phi_k(n)$ is total and has an r.e. range that is not recursive (and thus $R_k$

894 is infinite). Define the computable function $\phi_{\hat{k}}(n) = \phi_k(z(n))$, where

895 $z(n) = \mu i(\phi_k(i) \notin R_k(n - 1))$. It is clear that $\phi_{\hat{k}}(n)$ is a total, computable function with

896 range $R_k$. Now we simply need to show that $\phi_{\hat{k}}(n + 1) = G(\phi_{\hat{k}}(n))$ for all n for some

897 computable $G()$. By construction we can see that the computable function

898 $G(y) = \phi_{\hat{k}}(\mu z(\phi_{\hat{k}}(z) = y) + 1)$ works, where its domain is $R_{\hat{k}}$. This function takes a state y,

899 finds the unique time at which this state occurs (i.e., $\mu z(\phi_{\hat{k}}(z) = y)$; this is computable),

900 and then adds 1. The resulting value is then used in the function $\phi_{\hat{k}}(n)$ to compute the

901 state in the next time step. In particular, we can see that $G(\phi_{\hat{k}}(n)) = \phi_{\hat{k}}(n + 1)$.

902 Q.E.D. 
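The construction in Lemma 3 can be illustrated with a short sketch. It reads $z(n)$ as the first index whose value has not yet been output by the new function, which is an interpretive assumption, and the names `markov_version`, `phi_hat` and `G` are hypothetical. The input function must have an infinite range, otherwise the searches below do not terminate.

```python
from itertools import count

def markov_version(phi):
    """Return (phi_hat, G): phi_hat lists the values of phi in order of first
    appearance, and G maps each listed state to the state that follows it."""
    def phi_hat(n):
        seen = []
        for i in count(1):
            v = phi(i)
            if v not in seen:             # phi(i) is a value not yet output
                seen.append(v)
                if len(seen) == n:
                    return v
    def G(y):
        # unique time z at which state y occurs in phi_hat (y must be in the range)
        z = next(z for z in count(1) if phi_hat(z) == y)
        return phi_hat(z + 1)
    return phi_hat, G

def phi(i):
    """Toy total function with repeated values: 1, 2, 2, 5, 5, 10, 10, 17, ..."""
    return (i // 2) ** 2 + 1
```

Because `phi_hat` never repeats a value, each state has a unique successor, which is what makes the step map `G` well defined.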

903 5 Continuous Time Stochasticity 

904 For simplicity of exposition, all results of the main text have assumed that the evolutionary 

905 process is deterministic and that generations are discrete. Here we show that an analogous 

906 theorem holds if we relax these restrictions. 

907 To begin, it is easy to see that the assumption of discrete generations is immaterial. In 

908 particular, if we take generations to be continuous, then we can suppose that, at any 

909 instant in time, only a single event is possible (e.g., individual birth or death). Thus, 

910 because the state space is discrete, we can simply view the continuous-time process as one 

911 in which discrete events occur at points in time that need not be uniformly spaced. 

912 Allowing for stochasticity requires more work. If the evolutionary process is deterministic, 

913 then there is a single population state possible for each point in time, n. In the analysis of 

914 this case, we supposed that we had complete knowledge, not only of the evolutionary 

915 mapping, G and its initial condition, but of the solution to this mapping, $\phi_E(n)$ as well (and

916 it is a total, computable, function). 

917 Now there will be uncertainty in what the population state will be at time n, and in fact 

918 there will potentially be several different states that the population might attain at n. 

919 Some of these might be more likely than others in that, if we replayed the evolutionary 

920 process multiple times, certain states might arise more often than others. Thus we might 

921 imagine a probability distribution over the set of positive integers at each time step, n. By

922 analogy with the deterministic case, we make a Markov assumption, meaning that the

923 probability distribution on the population states at any given time, n, depends only on the

924 population state in the previous time, n — 1. In other words, there is some mapping, H, 

925 from current population state to the probability distribution over the population states in 

926 the next time period. The solution of this mapping (given an initial condition) then gives 

927 the probability distribution over the states at each point in time. 

928 Just as with the deterministic case, we suppose that we have complete knowledge of the

929 solution of this evolutionary process in the following sense: at any time n, we have a total,

930 computable function that tells us simply the set of states, at that time, that have positive

931 support. Thus, we have a total, computable, set-valued function $\tilde{\phi}_E(n)$ that gives the set of

932 "feasible" states at time n. The 'tilde' signals that this function is now a set-valued

933 function, rather than an integer-valued one. And again the goal of a negation-complete 

934 theory would then be to decide whether any given state lies within the set of feasible states 

935 or not. 

936 One objection to this formulation is that we might expect all states to have some nonzero

937 probability, even if it is vanishingly small. As such, under this definition all states would

938 then be trivially feasible. There are at least two potential responses to this objection.

939 First, while it is true that many models of evolution assume that all states have nonzero

940 probability (e.g., many stochastic models of mutation-selection balance, including those

941 with an infinite number of different alleles; Kimura and Crow (1964)), this is usually

942 because they are 'closed' models in the sense described earlier. In particular they often

943 assume, for mathematical convenience, that the stochastic process is irreducible and

944 positively recurrent. This then implies that a unique stationary distribution exists

945 (Grimmett and Stirzaker, 1992) and thereby rules out the possibility of open-ended

946 evolution. Although it is possible to develop a model for open-ended evolution that still 

947 has nonzero probability for all states, it is not obvious that this need be true of real

948 open-ended evolution. For example, out of the effectively infinite number of different

949 nucleotide combinations that could make up a genotype, we might expect at least some of

950 these to be truly lethal. On a more practical level, given the analysis presented here it

951 seems reasonable to expect that a similar theorem could be proved if we instead defined a

952 state as being feasible if it occurred with some probability greater than a small threshold

953 value, $\epsilon > 0$. At this point, however, such a theorem remains conjecture.

954 Given that all of our considerations with respect to computability have been restricted to

955 integer-valued functions, we now need to make the notion of computability of $\tilde{\phi}_E(n)$ more

956 precise. The set-valued function $\tilde{\phi}_E(n)$ can be thought of as consisting of two separate

957 computable functions, each of which is an integer-valued function and so fits within the

958 notions of computability already discussed. The first function is simply a computable

959 function $\phi_E(i)$ as before, whose range is now thought of as the set of feasible population

960 states. The argument i here is now no longer meant to be evolutionary time, however, but

961 rather is simply an index whose meaning is described below. The second computable

962 function we denote by $\phi_E^*(n)$, and it specifies the number of feasible population states in

963 generation n in the following way: the set of all feasible population states at time 1; i.e.,

964 $\tilde{\phi}_E(1)$, is given by $\{\phi_E(1), \phi_E(2), \ldots, \phi_E(k_1)\}$, where $\phi_E^*(1) = k_1$. Likewise,

965 $\tilde{\phi}_E(2) = \{\phi_E(k_1 + 1), \ldots, \phi_E(k_1 + k_2)\}$, where $\phi_E^*(2) = k_2$, and so on. In this way, we can

966 apply the same notions of computability to the set-valued function $\tilde{\phi}_E(n)$ by applying them

967 to its component, integer-valued, functions $\phi_E(i)$ and $\phi_E^*(n)$. We will assume that the set

968 $\tilde{\phi}_E(n)$ is finite for all n, which guarantees that it be computable. Nevertheless, it seems

969 reasonable to expect that some formulations in which this set is infinite would still be

970 computable, and thus would still fit within the results that follow.
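The decomposition of the set-valued function into two integer-valued functions can be sketched as below. Lists stand in for the functions (index 0 holds the value at argument 1), the ordering within a generation is chosen by sorting, and the names `flatten` and `feasible_at` are hypothetical; all of these are assumptions for illustration.

```python
def flatten(feasible_sets):
    """Given [set_1, set_2, ...], return (phi, phi_star) as Python lists."""
    phi, phi_star = [], []
    for states in feasible_sets:
        ordered = sorted(states)       # any fixed order works; sorting is arbitrary
        phi.extend(ordered)            # values phi_E(i) for consecutive indices i
        phi_star.append(len(ordered))  # k_n, the number of feasible states at time n
    return phi, phi_star

def feasible_at(phi, phi_star, n):
    """Recover the feasible set at time n from the two integer-valued functions."""
    start = sum(phi_star[: n - 1])     # k_1 + ... + k_(n-1)
    return set(phi[start : start + phi_star[n - 1]])
```

Note that the flattened list may repeat a state, since a state can be feasible in more than one generation.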

971 As in the deterministic case, we must also specify the initial conditions, in addition to the

972 mapping, H. Then, in terms of the mapping, H, if $x \in \tilde{\phi}_E(n)$ is a feasible population state

973 at time n, the set of feasible population states at time n + 1 is given by

974 $\tilde{\phi}_E(n + 1) = \bigcup_{x \in \tilde{\phi}_E(n)} \operatorname{support} H(x)$, where $\operatorname{support} H(x)$ denotes the set of states for which

975 H(x) has positive support. The range of $\tilde{\phi}_E(n)$ is the set of all states that are feasible at

976 some time (i.e., it is the range of $\phi_E(i)$). Likewise, a state is evolutionarily attainable if

977 there is some time for which it is feasible. A complete evolutionary theory is one for which

978 the predicate '$x \in R_E$' is decidable; i.e., if, given any population state, we can decide

979 whether it is feasible at some time.
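The update rule for feasible sets can be sketched as follows. The mapping H is represented only through a hypothetical function `support_H(x)` that returns the set of states with positive probability after x; the toy choice in the usage note is an assumption for illustration.

```python
def step(feasible, support_H):
    """One generation: union of the supports of H(x) over the currently feasible x."""
    nxt = set()
    for x in feasible:
        nxt |= support_H(x)
    return nxt

def feasible_sets(initial, support_H, generations):
    """Feasible sets for times 1..generations, starting from the initial condition."""
    sets = [set(initial)]
    for _ in range(generations - 1):
        sets.append(step(sets[-1], support_H))
    return sets

def attained(sets):
    """States feasible at some time up to the last computed generation."""
    return set().union(*sets)
```

With `support_H = lambda x: {x, x + 1}` and initial set `{1}`, the first three feasible sets are `{1}`, `{1, 2}` and `{1, 2, 3}`. Such a finite computation only ever yields a lower bound on the attainable set, which is why deciding membership is the hard part.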

980 The same definition of progressive evolution can be used in both the deterministic and 

981 stochastic cases. To specify this precisely, we need the following Lemmas:

982 Lemma 1: In the deterministic case, a new state is visited every time step if, and only if,

983 evolution is unbounded (i.e., $R_E$ is infinite).

984 Lemma 2: In the stochastic case, at least one new state is feasible every time step if, and

985 only if, evolution is unbounded (i.e., $R_E$ is infinite).

986 Proof is given of Lemma 2 only (Lemma 1 can be proven in an analogous fashion). We

987 note that, in the remainder of this section, we use the notation $R_E(n)$ to denote the set of

988 population states that have been visited (i.e., feasible) by step n of the set-valued function,

989 $\tilde{\phi}_E(n)$ (i.e., not step n of $\phi_E(n)$). Equivalently, it denotes the range of $\phi_E(i)$ visited by step

990 $i = k_1 + k_2 + \cdots + k_n$.

991 Proof: 

992 'At least one new state is feasible each time step' $\Rightarrow$ 'Evolution unbounded'

993 This direction of the implication is obvious since, if at least one new state is feasible each

994 time step, then the fact that $\tilde{\phi}_E(n)$ is total implies that $R_E$ is infinite.

995 'Evolution unbounded' $\Rightarrow$ 'At least one new state is feasible each time step'

996 Contrary to the assertion, suppose instead that $R_E$ is infinite but that there is some time,

997 $n^*$ at which no new state is feasible. In other words, for some time $n^*$, the set $\tilde{\phi}_E(n^*)$

998 satisfies $\tilde{\phi}_E(n^*) \subseteq R_E(n^* - 1)$. The set of feasible states in the next time step is then given

999 by $\tilde{\phi}_E(n^* + 1) = \bigcup_{x \in \tilde{\phi}_E(n^*)} \operatorname{support} H(x)$. Furthermore, for each element, $x \in \tilde{\phi}_E(n^*)$,

1000 $\exists n_x < n^*$ such that $x \in \tilde{\phi}_E(n_x)$ (from the hypothesis that $\tilde{\phi}_E(n^*) \subseteq R_E(n^* - 1)$).

1001 Therefore, for each element, $x \in \tilde{\phi}_E(n^*)$, we have that $\operatorname{support} H(x) \subseteq \tilde{\phi}_E(n_x + 1)$, where

1002 $n_x < n^*$. Thus, we have

$$\begin{aligned} \tilde{\phi}_E(n^* + 1) &= \bigcup_{x \in \tilde{\phi}_E(n^*)} \operatorname{support} H(x) && (6) \\ &\subseteq \bigcup_{x \in \tilde{\phi}_E(n^*)} \tilde{\phi}_E(n_x + 1) && (7) \\ &\subseteq R_E(n^* - 1). && (8) \end{aligned}$$

1003 Hence, by induction, $R_E = R_E(n^* - 1)$, which is finite, yielding a contradiction.

1004 Q.E.D. 

1005 Notice that, in the deterministic case, when evolution is unbounded the computable 

1006 function $\phi_E(i)$ never repeats a previously attained value as i increases (Lemma 1 above).

1007 In the stochastic case, however, even when evolution is unbounded, $\phi_E(i)$ can repeat

1008 previously attained values as i increases. The key connection between the two cases is that,

1009 in the stochastic case, $\phi_E^*(n)$ is such that, when the outputs of $\phi_E(i)$ are grouped into

1010 their corresponding evolutionary generations, each such grouping always contains at least 1 

1011 new feasible state (Lemma 2 above). 

1012 Now, returning to the proof of the theorem, in the deterministic case, Lemma 1 shows that

1013 a new population state is visited at every time step. And if evolution is progressive, then

1014 there is some way to recode the population states such that the code number of these new

1015 states that are visited over time increases. Likewise, Lemma 2 shows that at least one new

1016 population state becomes feasible at every time step, although some visited population

1017 states might have been visited previously as well. Nevertheless, we still say that evolution

1018 is progressive if there is some way to recode the population states such that the code

1019 number(s) of the new states that become feasible each time step increase with time.

1020 Formally, if we define $\sigma_E^C(n) = R_E^C(n) \setminus R_E^C(n - 1)$ as the set of newly feasible states in

1021 generation n, and $\min \sigma_E^C(n)$ as the smallest of these, then evolution is progressive if there

1022 exists a computable bijection, C, between the positive integers and the population states,

1023 such that $\min \sigma_E^C(n + 1) > \min \sigma_E^C(n)$ for all n. Since the set $R_E^C(n)$ is finite and

1024 computable for all n, $\min \sigma_E^C(n)$ is a total computable function.
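The quantities in this definition can be computed for any finite prefix of the process, as in the sketch below. It checks the inequality only over the generations supplied and only for the coding in which the states are already expressed, so it cannot establish progressiveness in general. The function names are hypothetical.

```python
def newly_feasible(sets):
    """sigma(n) for each generation: states feasible at n but at no earlier time."""
    seen, out = set(), []
    for s in sets:
        out.append(set(s) - seen)
        seen |= s
    return out

def is_progressive_prefix(sets):
    """True if min sigma(n + 1) > min sigma(n) across the supplied generations.

    Assumes every generation contains at least one new state, as Lemma 2
    guarantees for unbounded evolution; otherwise min() raises ValueError.
    """
    mins = [min(new) for new in newly_feasible(sets)]
    return all(b > a for a, b in zip(mins, mins[1:]))
```

For instance, the prefix `[{1}, {1, 2}, {2, 3}]` passes the check, whereas `[{5}, {5, 2}, {3}]` does not under that coding.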

1025 The proof of the theorem then goes through as follows: 

1026 Theorem: '$x \in R_E$' is decidable (i.e., $R_E$ is recursive) if, and only if, there exists a

1027 computable one-to-one coding of the population states by positive integers, C, such that, for

1028 the corresponding description of the evolutionary process, $\tilde{\phi}_E^C(n)$,

1029 $\min \sigma_E^C(n + 1) > \min \sigma_E^C(n)$ for all n.

1030 Proof:

1031 Part 1: $\exists C$ s.t. $\min \sigma_E^C(n + 1) > \min \sigma_E^C(n)$ $\forall n$ $\Rightarrow$ $R_E$ recursive

1032 By hypothesis there exists a computable bijection C such that $\min \sigma_E^C(n + 1) > \min \sigma_E^C(n)$

1033 for all n. Now for any population state, x, in the original coding, let $\hat{x}$ be the corresponding

1034 code under bijection C. Define $z(\hat{x}) = \mu i(\min \sigma_E^C(i) > \hat{x})$. Clearly '$\hat{x} \in R_E^C(z(\hat{x}))$' is

1035 decidable since $R_E^C(z(\hat{x}))$ is finite and enumerable. Furthermore $\hat{x} \in R_E^C(z(\hat{x})) \Leftrightarrow \hat{x} \in R_E^C$

1036 owing to the progressive nature of evolution. Therefore, '$\hat{x} \in R_E^C$' is decidable as well.

1037 Finally, using S to denote the set of population states that are evolutionarily attainable, we

1038 have that $\hat{x} \in R_E^C \Leftrightarrow C^{-1}\hat{x} \in S \Leftrightarrow x \in R_E$, because x is, by definition, the code of the

1039 population state $C^{-1}\hat{x}$ in the original coding. Thus, '$x \in R_E$' is decidable as well.
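The decision procedure of Part 1 can be sketched as follows, assuming that the states are already expressed in a coding under which evolution is progressive. The function `feasible_at(n)`, returning the recoded feasible set at time n, and the toy process in the usage note are hypothetical.

```python
from itertools import count

def decide_attainable(x_hat, feasible_at):
    """Decide whether the recoded state x_hat is ever feasible.

    Assumes progressive evolution: the minimum code among newly feasible
    states strictly increases with time, and each generation has a new state.
    """
    seen = set()
    for n in count(1):
        new = feasible_at(n) - seen      # newly feasible states at time n
        seen |= new
        if min(new) > x_hat:             # time z(x_hat): later new codes are larger still
            return x_hat in seen
```

With the toy process `feasible_at = lambda n: {n * n, n * n + 1}`, the call for 5 returns `True` and the call for 7 returns `False`, each after examining three generations. The loop always halts because the increasing minima must eventually exceed any fixed code.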

1040 Part 2: $R_E$ recursive $\Rightarrow$ $\exists C$ s.t. $\min \sigma_E^C(n + 1) > \min \sigma_E^C(n)$ $\forall n$

1041 We can construct the required computable bijection to show that evolution is progressive

1042 as follows.

1043 Since $R_E$ is recursive, we know that '$x \in R_E$' is decidable. So take the population states, x,

1044 in order and go down the list using the following algorithm:

1045 (i) if $x \notin R_E$ and it is the $k$-th such state up to that point, return the $k$-th odd number.

1046 (ii) if $x \in R_E$, and if it has not yet been assigned a new code number, do the following:

1047 • calculate $\mu i(x \in \tilde{\phi}_E(i))$ (i.e., the first time that x becomes feasible).

1048 • calculate $\sigma_E(i)$, the entire set of newly feasible states at i.

1049 • using the notation $|A|$ to denote the cardinality of A, assign codes to all of the $|\sigma_E(i)|$

1050 elements in $\sigma_E(i)$, by starting with the $(|R_E(i - 1)| + 1)$-th even number, up to the $|R_E(i)|$-th

1051 even number, in any order.

1052 • move on to the next state in the list.
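The recoding algorithm in points (i) and (ii) can be sketched as below. The decision procedure `is_attainable`, the set-valued function `feasible_at`, and the toy process are all hypothetical stand-ins; within a generation the even codes are assigned in sorted order, which is one arbitrary choice among those the algorithm allows.

```python
from itertools import count
from math import isqrt

def recode(states, is_attainable, feasible_at):
    """Odd codes for unattainable states, even codes for attainable ones, with
    states that first become feasible later receiving larger even codes."""
    codes, odd_used = {}, 0
    for x in states:
        if x in codes:
            continue                       # already coded along with its generation
        if not is_attainable(x):
            odd_used += 1
            codes[x] = 2 * odd_used - 1    # k-th odd number
            continue
        seen = set()
        for i in count(1):                 # first generation i in which x is feasible
            new = feasible_at(i) - seen
            if x in new:
                break
            seen |= new
        # here new is sigma_E(i) and seen is R_E(i - 1)
        for rank, s in enumerate(sorted(new), start=len(seen) + 1):
            codes[s] = 2 * rank            # rank-th even number
    return codes

# Toy process: generation n makes the states n*n and n*n + 1 feasible.
def feasible_at(n):
    return {n * n, n * n + 1}

def is_attainable(x):
    return isqrt(x) ** 2 == x or (x > 1 and isqrt(x - 1) ** 2 == x - 1)
```

Running `recode(range(1, 11), is_attainable, feasible_at)` on this toy process gives generation 1 the codes 2 and 4, generation 2 the codes 6 and 8, and generation 3 the codes 10 and 12, so the minimum new code increases with time.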

1053 Thus, $R_E^C$ is again the set of even numbers, and the new states that are feasible each time

1054 step always have larger code values as time increases. In particular, the algorithm described

1055 above in points (i) and (ii) takes each code x to its corresponding population state, s, using the

1056 inverse of the coding that generated x, and then assigns the new code. Applying it to

1057 $\sigma_E(n + 1)$ gives $\min \sigma_E^C(n + 1) = 2(|R_E(n)| + 1)$. This equality

1058 follows from the fact that the algorithm determines the first time that each element of

1059 $\sigma_E(n + 1)$ occurs (which is n + 1 for all such elements by definition), and then assigns the

1060 codes $2(|R_E(n)| + 1)$ up to $2|R_E(n + 1)|$ for these elements. The minimum of these codes is,

1061 of course, $2(|R_E(n)| + 1)$, giving $\min \sigma_E^C(n + 1) = 2(|R_E(n)| + 1)$. As a result,

1062 $\min \sigma_E^C(n + 1) > \min \sigma_E^C(n)$ because $|R_E(n)|$ is strictly increasing with n (from Lemma 2).

1063 Q.E.D. 



6 Effectively Infinite Systems 



The simplified system of evolution considered in the main text assumes that the space of 
potential population states is infinite, and focuses on unbounded evolution (i.e., $|R_E| = \infty$).
One might argue, however, that any real system of evolution is necessarily finite, if only 
because of a potential limit to the constituent elements of the genetic material. There are 
two potential responses to this objection. First, on a philosophical level, although any 
particular evolutionary system might be finite, one might nevertheless want evolutionary 
theory to stand abstractly, independent of any particular instantiation of an evolutionary 
dynamic. This is very much analogous to the fact that, in the context of number theory, 
although one necessarily only ever has to deal with a finite number of things that require 
counting, we nevertheless desire an abstract theory of numbers that does not presuppose 
any finite limitations. And just as such a negation-complete theory of numbers is not
possible (Gödel, 1931; Nagel and Newman, 1958; Davis, 1965; van Heijenoort ed., 1967; Smith,
2007), neither is one for evolutionary biology unless evolution is progressive.



Second, on a more practical level, it is clear that the digital nature of heredity offered by 
DNA/RNA makes such systems effectively infinite in that the number of possible 
population states is enormous. The remainder of this section makes the notion of
effectively infinite precise. For simplicity, the focus below is on the deterministic system. 



Recall that, in the $|R_E| = \infty$ case, a function is computable (and total) if it can be

1083 evaluated in a finite number of steps, for any input (Cutland, 1980) (Appendix E]). Thus

1084 the predicate '$x \in R_E$' is decidable if its characteristic function can be evaluated, for any

1085 input value x, in a finite number of steps. Likewise, the mapping C of the theorem is 

1086 computable if, for any input, it returns a code number in a finite number of steps. 

1087 When $|R_E| < \infty$, however, the predicate '$x \in R_E$' is always decidable because we can

1088 always carry out a complete cataloguing of $R_E$ in a finite number of steps. We simply need

1089 to successively evaluate $\phi_E(n)$ for increasing values of n. According to Lemma 1 of

1090 Appendix 5, because $R_E$ is finite, we will eventually obtain a value that has previously

1091 been visited, and from that point onward the system will then simply revisit previously 

1092 visited states. 
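The complete cataloguing of a finite deterministic system can be sketched as follows. The step map `G` in the usage note is a hypothetical toy, chosen only so that the orbit revisits a state quickly.

```python
def catalogue(G, x0):
    """Iterate x -> G(x) from x0 until a state repeats; return visited states in order."""
    order, seen = [], set()
    x = x0
    while x not in seen:
        seen.add(x)
        order.append(x)
        x = G(x)
    return order
```

With `G = lambda x: (x * x + 1) % 11` and `x0 = 1`, the catalogue is `[1, 2, 5, 4, 6]`, after which the process returns to state 4. Membership in the attainable set is then a lookup, but the number of steps needed to build the catalogue grows with the size of the state space, which is the point taken up next.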

1093 Although these observations are formally correct, they nevertheless fail to capture the 

1094 important consequences of digital inheritance in finite systems. In particular, the natural 

1095 analogue of computability for such finite systems in the context of indefinite heredity is not 

1096 the requirement that an output be obtained in a finite number of steps. Rather, it is that 

1097 an output be obtained in a finite number of steps, and that this number of steps not exceed 

1098 some finite bound that is independent of the size of the state space, $|R_E|$. For example, with

1099 this definition for finite state spaces, the predicate '$x \in R_E$' would be decidable if its

1100 characteristic function can be evaluated in a finite number of steps, and if this number

1101 never exceeds some finite bound that is independent of $|R_E|$. Thus, regardless of the size of

1102 $|R_E|$, we are guaranteed to never need more than a fixed number of computational steps.

1103 To formalize these ideas, we need to be precise about what it means to consider state 

1104 spaces of different sizes, $|R_E|$. We do this as follows. First, consider the infinite state space

1105 situation used in the main text, where $\phi_E(n)$ denotes the computable function

1106 corresponding to the evolutionary process. Next, define the finite state space process by a

1107 computable function, $F_\eta(n)$, where $n = \eta + 1$ is the first time at which a previously visited

1108 population state is re-visited, and where $F_\eta(n) = \phi_E(n)$ for all $n \le \eta$. Note that we have

1109 $\eta = |R_E|$, and thus $\eta$ is the state space size. In this way, any given finite state space

1110 process is identical to the reference infinite state space process, $\phi_E(n)$, over time until the

1111 point $\eta + 1$ at which the finite process begins to revisit previously visited states. Thus we

1112 can consider state spaces of different sizes, $\eta$, with the limiting case of $\eta \to \infty$

1113 corresponding to the infinite state space of the main text. We have the following revised

1114 definitions for the finite case:

1115 Definition: The predicate '$x \in R_E$' is *decidable if, for any input x, there exists a $T < \infty$

1116 such that the characteristic function $c_{R_E}(x)$ can be evaluated in no more than T steps,

1117 where T is independent of $\eta$ (i.e., independent of system size).

1118 Definition: A one-to-one mapping of the population states by the positive integers, C, is

1119 *computable if, for any input there exists a $T < \infty$ such that the mapping can be

1120 evaluated in no more than T steps, where T is independent of $\eta$.

1121 The main theorem of the text can again be seen to hold when $|R_E| < \infty$ if we use the

1122 above definitions. In particular,

1123 Theorem: '$x \in R_E$' is *decidable if, and only if, there exists a *computable one-to-one

1124 coding of the population states by a subset of the positive integers, C, such that the

1125 corresponding description of the evolutionary process, $F_\eta^C(n)$, satisfies $F_\eta^C(n + 1) > F_\eta^C(n)$

1126 for all $n < \eta$.

1127 Notice that there is one difference from the main theorem of the text; namely, the altered 

1128 characterization of progressive evolution. Now, because $R_E$ is finite, we say that evolution

1129 is progressive if there is some quantity that increases over time before the process begins to 

1130 repeat. Also note that, in addition to the altered definition of 'computable' and 'decidable' 

1131 in the statement of the theorem, all other instances of computability use this altered 

1132 definition as well. 

1133 Only a sketch of a formal proof is given for this modified theorem because it is similar to that

1134 of the main text. Recall that $F_\eta(n)$ denotes the computable function corresponding to the

1135 finite evolutionary system of interest. 

1136 Proof (Sketch): 

1137 Part 1: $\exists$ *C s.t. $F_\eta^C(n + 1) > F_\eta^C(n)$ $\forall n < \eta$ $\Rightarrow$ '$x \in R_E$' *decidable

1138 As before, take any input x and find its new code, $\hat{x}$. By hypothesis the number of steps

1139 required is bounded by a constant that is independent of system size. Next, we can begin

1140 to successively evaluate $F_\eta(n)$ for increasing values of n. We suppose that the number of

1141 steps required in this computation for any $n \le \eta$ is independent of $\eta$. This is a reasonable

1142 assumption because the outputs are identical to those of $\phi_E(n)$ when $n \le \eta$, and the

1143 number of steps required to evaluate $\phi_E(n)$ is independent of $\eta$ for any n. To each output

1144 of $F_\eta(n)$ we can apply the above mapping, C, to obtain $F_\eta^C(n)$, which by hypothesis,

1145 increases with $n \le \eta$. By hypothesis the number of steps required is independent of $\eta$ for

1146 each such application.

1147 As we proceed, either we reach (i) $n = \eta$ prior to reaching an n for which $\hat{x} < F_\eta^C(n)$, or we

1148 reach (ii) a value of n whereby $\hat{x} < F_\eta^C(n)$ before $n = \eta$. In either case '$\hat{x} \in R_E^C$' is then

1149 decidable because, if $\hat{x}$ has not been reached by this point, it never will be. Thus, '$x \in R_E$'

1150 is decidable as well. Moreover, if (i) pertains, then the number of steps required before

1151 deciding is no more than $\mu i(\phi_E^C(i) > \hat{x})$. If (ii) pertains, then this number of steps is exactly

1152 equal to $\mu i(\phi_E^C(i) > \hat{x})$. And because $\mu i(\phi_E^C(i) > \hat{x})$ is finite and independent of $\eta$, we can

1153 see that '$x \in R_E$' is *decidable as well.

1154 Part 2: '$x \in R_E$' *decidable $\Rightarrow$ $\exists$ *C s.t. $F_\eta^C(n + 1) > F_\eta^C(n)$ $\forall n < \eta$

1155 We can construct the required *computable bijection between population states and an

1156 appropriate coding as follows. First, take any effective coding of population states. By

1157 hypothesis, the number of steps required to decide '$x \in R_E$' for any x is finite and

1158 independent of $\eta$. Thus, we can proceed through the population states, x, in increasing

1159 order, applying the following algorithm:

1160 (i) if $x \notin R_E$ and it is the $k$-th such state up to that point, use the $k$-th odd number as its

1161 new code.

1162 (ii) if $x \in R_E$, calculate $\mu i(F_\eta(i) = x)$, and use the $i$-th even number as its new code.

1163 As we proceed through the states, x, the number of steps required for each, regardless of

1164 whether (i) or (ii) pertains, is independent of $\eta$. Therefore, the entire coding procedure

1165 for any given state is independent of $\eta$ as well; i.e., the coding is *computable as required.
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